Abstract. We count $m \times n$ non-negative integer matrices (contingency tables) with
prescribed row and column sums (margins). For a wide class of smooth margins we
establish a computationally efficient asymptotic formula approximating the number
of matrices within a relative error which approaches 0 as $m$ and $n$ grow.



1. Introduction and main results 
Let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be positive integer vectors such that

$$r_1 + \ldots + r_m = c_1 + \ldots + c_n = N.$$



We are interested in the number $\#(R, C)$ of $m \times n$ non-negative integer matrices
$D = (d_{ij})$ with row sums $R$ and column sums $C$. Such matrices $D$ are often called
contingency tables with margins $(R, C)$. The problem of computing or estimating
$\#(R, C)$ efficiently has attracted considerable attention, see, for example, [B+72],
[Be74], [GC77], [DE85], [DG95], [D+97], [Mo02], [CD03], [C+05], [CM07], [GM07],
[B+08], [Z+09] and [Ba09].
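
For very small margins the number $\#(R, C)$ can be checked by direct enumeration. The following is a minimal sketch, not taken from the paper; the name `count_tables` is an assumption made for illustration. It fills the matrix row by row and takes exponential time, so it is useful only as a sanity check against asymptotic estimates.

```python
from functools import lru_cache


def count_tables(R, C):
    """Count non-negative integer matrices with row sums R and column sums C."""
    R, C = tuple(R), tuple(C)
    if sum(R) != sum(C):
        return 0

    def rows_fitting(total, caps):
        """Yield every non-negative integer row with the given sum and entrywise caps."""
        if not caps:
            if total == 0:
                yield ()
            return
        for first in range(min(total, caps[0]) + 1):
            for rest in rows_fitting(total - first, caps[1:]):
                yield (first,) + rest

    @lru_cache(maxsize=None)
    def count(row_index, remaining):
        """Number of ways to fill rows row_index.. given the remaining column sums."""
        if row_index == len(R):
            return 1 if all(c == 0 for c in remaining) else 0
        total = 0
        for row in rows_fitting(R[row_index], remaining):
            left = tuple(c - d for c, d in zip(remaining, row))
            total += count(row_index + 1, left)
        return total

    return count(0, C)
```

For instance, `count_tables((2, 2), (2, 2))` enumerates the three $2 \times 2$ tables determined by the entry $d_{11} \in \{0, 1, 2\}$.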

Asymptotic formulas for numbers $\#(R, C)$ as $m$ and $n$ grow are known in sparse
cases, where the average entry $N/mn$ of the matrix goes to 0, see [B+72], [Be74],
[GM07], and in the case when all row sums are equal and all column sums are equal,
$r_1 = \ldots = r_m$ and $c_1 = \ldots = c_n$, see [CM07]. In [Ba09] an asymptotic formula
for $\log \#(R, C)$ is established under quite general circumstances.

In this paper, we prove an asymptotic formula for $\#(R, C)$ for a reasonably wide
class of smooth margins $(R, C)$. In [BH10] we apply a similar approach to find an
asymptotic formula for the number of matrices (binary contingency tables) with row
sums $R$, column sums $C$ and 0-1 entries as well as to find an asymptotic formula
for the number of graphs with prescribed degrees of vertices.

(1.1) The typical matrix and smooth margins. The typical matrix was introduced
in [Ba09] and various versions of smoothness for margins were introduced
in [B+08] and in [Ba08]. The function

$$g(x) = (x + 1) \ln(x + 1) - x \ln x \quad \text{for } x \ge 0$$

plays the crucial role. It is easy to see that $g$ is increasing and concave with $g(0) = 0$.
For an $m \times n$ non-negative matrix $X = (x_{jk})$ we define

$$g(X) = \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} g(x_{jk}) = \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} \bigl((x_{jk} + 1) \ln(x_{jk} + 1) - x_{jk} \ln x_{jk}\bigr).$$
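
The definition of $g$ translates directly into code. The sketch below is illustrative only; the names `g` and `g_matrix` are assumptions, and the value at $x = 0$ is set to the limit $g(0) = 0$.

```python
import math


def g(x):
    """g(x) = (x + 1) ln(x + 1) - x ln x for x >= 0, with g(0) = 0 by continuity."""
    if x < 0:
        raise ValueError("g is defined for non-negative arguments only")
    if x == 0:
        return 0.0
    return (x + 1) * math.log(x + 1) - x * math.log(x)


def g_matrix(X):
    """g(X): the sum of g over all entries of a non-negative matrix X."""
    return sum(g(x) for row in X for x in row)
```

Evaluating `g` at a few points is a quick way to see the monotonicity and concavity claimed above, for example by comparing `g(1.5)` with the average of `g(1)` and `g(2)`.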

Given margins $(R, C)$, let $P(R, C)$ be the polytope of all real non-negative $m \times n$
matrices $X = (x_{jk})$ with row sums $R$ and column sums $C$, also known as the
transportation polytope. We consider the following optimization problem:

(1.1.1) Find $\max g(X)$ subject to $X \in P(R, C)$.

Since $g$ is strictly concave, the maximum is attained at a unique matrix $Z = (\zeta_{jk})$,
which we call the typical matrix with margins $(R, C)$. One can show that $\zeta_{jk} > 0$
for all $j$ and $k$, see [B+08] and [Ba08]. In [Ba08] it is shown that a random contingency
table, sampled from the uniform distribution on the set of all non-negative
integer matrices with row sums $R$ and column sums $C$ is, in some rigorously defined
sense, likely to be close to the typical matrix $Z$. In [BH09] we give the following
probabilistic interpretation of $Z$. Let us consider the family of all probability
distributions on the set $\mathbb{Z}_+^{m \times n}$ of all non-negative $m \times n$ integer matrices with the
expectations in the affine subspace $A(R, C)$ of the $m \times n$ matrices with row sums $R$
and column sums $C$. In this family there is a unique distribution of the maximum
entropy and $Z$ turns out to be the expectation of that distribution. The maximum
entropy distribution is necessarily a distribution on $\mathbb{Z}_+^{m \times n}$ with independent
geometrically distributed coordinates, which, conditioned on $A(R, C)$, results in the
uniform distribution on the set of contingency tables with margins $(R, C)$. Function
$g(X)$ turns out to be the entropy of the multivariate geometric distribution on
$\mathbb{Z}_+^{m \times n}$ with the expectation $X$.
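
A small numerical sketch of problem (1.1.1) follows. It is not the interior point method mentioned later in the text but a simple alternating scheme, chosen here only for illustration: the stationarity condition $g'(\zeta_{jk}) = \ln((\zeta_{jk} + 1)/\zeta_{jk}) = \lambda_j + \mu_k$ gives $\zeta_{jk} = x_j y_k / (1 - x_j y_k)$ with $x_j = e^{-\lambda_j}$ and $y_k = e^{-\mu_k}$, and the code alternately solves the row and column sum equations by bisection. The function names and the iteration counts are assumptions.

```python
import math


def _scaled_sum(x, weights):
    """sum_k x*w_k / (1 - x*w_k), or +infinity when some x*w_k >= 1."""
    total = 0.0
    for w in weights:
        p = x * w
        if p >= 1.0:
            return math.inf
        total += p / (1.0 - p)
    return total


def _solve_scale(weights, target, iterations=80):
    """Bisection for x in (0, 1/max(weights)) with _scaled_sum(x, weights) = target."""
    lo, hi = 0.0, 1.0 / max(weights)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if _scaled_sum(mid, weights) < target:
            lo = mid
        else:
            hi = mid
    return lo  # lo always keeps every product x*w_k strictly below 1


def typical_matrix(R, C, sweeps=200):
    """Approximate the typical matrix Z for positive margins R and C."""
    x = [0.5] * len(R)
    y = [0.5] * len(C)
    for _ in range(sweeps):
        x = [_solve_scale(y, r) for r in R]   # match the row sums for fixed y
        y = [_solve_scale(x, c) for c in C]   # match the column sums for fixed x
    return [[xj * yk / (1.0 - xj * yk) for yk in y] for xj in x]
```

When all row sums are equal, the output can be compared with the closed form $\zeta_{jk} = c_k/m$; for example, `typical_matrix((6, 6), (2, 4, 6))` should have both rows close to $(1, 2, 3)$.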

Let us fix a number $0 < \delta < 1$. We say that margins $(R, C)$ are $\delta$-smooth
provided the following conditions (1.1.2)-(1.1.4) are satisfied:

(1.1.2) $m \ge \delta n$ and $n \ge \delta m$,
so the dimensions of the matrix are of the same order;

(1.1.3) $\delta \tau \le \zeta_{jk} \le \tau$ for all $j$ and $k$,
for some $\tau$ such that

(1.1.4) $\tau \ge \delta$.

We note that $\delta$-smooth margins are also $\delta'$-smooth for any $0 < \delta' < \delta$.

Condition (1.1.3) requires that the entries of the typical matrix are of the same
order and it plays a crucial role in our proofs. Often, one can show that margins
are smooth by predicting what the solution to the optimization problem (1.1.1)
will look like. For example, if all row sums $r_j$ are equal, symmetry requires that
we have $\zeta_{jk} = c_k/m$ for all $j$ and $k$, so the entries of the typical matrix are of the
same order provided the column sums $c_k$ are of the same order. On the other hand,
(1.1.3) is violated in some curious cases. For example, if $m = n$ and
$r_1 = \ldots = r_{n-1} = c_1 = \ldots = c_{n-1} = n$ while $r_n = c_n = 3n$, the entry $\zeta_{nn}$ of the typical
matrix is linear in $n$, namely $\zeta_{nn} > 0.58n$, while all other entries of $Z$ remain
bounded by a constant, see [Ba08]. If we change $r_n$ and $c_n$ to $2n$, the entry $\zeta_{nn}$
becomes bounded by a constant as well. One may wonder (this question is inspired
by a conversation with B. McKay) if the smoothness condition (1.1.3) is indeed
necessary for the number of tables $\#(R, C)$ to be expressible by a formula which
varies "smoothly" as the margins $R$ and $C$ vary, like the formula in Theorem 1.3
below. In particular, can there be a sudden jump in the number of tables with
$m = n$, $r_1 = \ldots = r_{n-1} = c_1 = \ldots = c_{n-1} = n$ when $r_n = c_n$ crosses a certain
threshold between $2n$ and $3n$?

In [B+08] it is proven that if the ratio of the maximum row sum $r_+ = \max_j r_j$
to the minimum row sum $r_- = \min_j r_j$ and the ratio of the maximum column sum
$c_+ = \max_k c_k$ to the minimum column sum $c_- = \min_k c_k$ do not exceed a number
$\beta < (1 + \sqrt{5})/2 \approx 1.618$, then (1.1.3) is satisfied with some $\delta = \delta(\beta) > 0$. The
bound $(1 + \sqrt{5})/2$ is not optimal; apparently it can be increased to 2, see [Lu08].
It looks plausible that if the margins are of the same order and sufficiently generic
then the entries of the typical table are of the same order as well.

The lower bound in (1.1.4) requires that the density $N/mn$ of the margins, that is
the average entry of the matrix, remains bounded away from 0. This is unavoidable
as our asymptotic formula does not hold for sparse cases where $N/mn \to 0$, see
[GM07].

We proceed to define various objects needed to state our asymptotic formula.

(1.2) Quadratic form q and related quantities. Let $Z = (\zeta_{jk})$ be the typical
matrix defined in Section 1.1. We consider the following quadratic form
$q : \mathbb{R}^{m+n} \to \mathbb{R}$:

(1.2.1)
$$q(s, t) = \frac{1}{2} \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} \bigl(\zeta_{jk}^2 + \zeta_{jk}\bigr) (s_j + t_k)^2 \quad \text{where } s = (s_1, \ldots, s_m) \text{ and } t = (t_1, \ldots, t_n).$$



Thus $q$ is a positive semidefinite quadratic form. It is easy to see that the null-space
of $q$ is spanned by vector

$$u = (\underbrace{1, \ldots, 1}_{m \text{ times}};\ \underbrace{-1, \ldots, -1}_{n \text{ times}}).$$

Let $H = u^{\perp}$, $H \subset \mathbb{R}^{m+n}$, be the orthogonal complement to $u$. Then the restriction
$q|H$ is a positive definite quadratic form and hence we can define its determinant
$\det q|H$, that is the product of the non-zero eigenvalues of $q$. Let us define
polynomials $f, h : \mathbb{R}^{m+n} \to \mathbb{R}$ by

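(1.2.2)
$$f(s, t) = \frac{1}{6} \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} \zeta_{jk} (\zeta_{jk} + 1)(2\zeta_{jk} + 1)(s_j + t_k)^3$$
and
$$h(s, t) = \frac{1}{24} \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} \zeta_{jk} (\zeta_{jk} + 1)\bigl(6\zeta_{jk}^2 + 6\zeta_{jk} + 1\bigr)(s_j + t_k)^4.$$

We consider the Gaussian probability measure on $H$ with the density proportional
to $e^{-q}$ and define

$$\mu = \mathbf{E} f^2 \quad \text{and} \quad \nu = \mathbf{E} h.$$

Now we have all the ingredients to state our asymptotic formula for $\#(R, C)$.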

(1.3) Theorem. Let us fix $0 < \delta < 1$. Let $(R, C)$ be $\delta$-smooth margins, let the
function $g$ and the typical matrix $Z$ be as defined in Section 1.1 and let the quadratic
form $q$ and values of $\mu$ and $\nu$ be as defined in Section 1.2. Then the value of

$$\frac{e^{g(Z)} \sqrt{m + n}}{(4\pi)^{(m+n-1)/2} \sqrt{\det q|H}} \exp\Bigl\{-\frac{\mu}{2} + \nu\Bigr\}$$

approximates $\#(R, C)$ within a relative error which approaches 0 as $m, n \to +\infty$.
More precisely, for any $0 < \epsilon \le 1/2$ the above expression approximates $\#(R, C)$
within relative error $\epsilon$ provided

$$m + n \ge \Bigl(\frac{1}{\epsilon}\Bigr)^{\gamma(\delta)}$$

for some $\gamma(\delta) > 0$.

In [CM07] Canfield and McKay obtain an asymptotic formula for $\#(R, C)$ in
the particular case of all row sums being equal and all column sums being equal.
One can show that our formula indeed becomes the asymptotic formula of [CM07]
when $r_1 = \ldots = r_m$ and $c_1 = \ldots = c_n$. In [Ba09] it is proven that the value
$g(Z)$ provides an asymptotic approximation to $\ln \#(R, C)$ for a rather wide class
of margins (essentially, we need only the density $N/mn$ to be bounded away from
0 but do not need a subtler condition (1.1.3) of smoothness). The first part

(1.3.1)
$$\frac{e^{g(Z)} \sqrt{m + n}}{(4\pi)^{(m+n-1)/2} \sqrt{\det q|H}}$$

of the formula is called the "Gaussian approximation" in [BH09]. It has the following
intuitive explanation. Let us consider a random matrix $X$ with the multivariate
geometric distribution on the set $\mathbb{Z}_+^{m \times n}$ of all non-negative integer matrices such
that $\mathbf{E} X = Z$, where $Z$ is the typical matrix with margins $(R, C)$. It follows from
the results of [BH09] that the distribution of $X$ conditioned on the affine subspace
$A = A(R, C)$ of matrices with row sums $R$ and column sums $C$ is uniform with
the probability mass function of $e^{-g(Z)}$ for every non-negative integer matrix in $A$.
Therefore,

$$\#(R, C) = e^{g(Z)} \mathbf{P}\{X \in A\}.$$

Let $Y \in \mathbb{R}^{m+n}$ be a random vector obtained by computing $m$ row sums and $n$
column sums of $X$. Then $\mathbf{E} Y = (R, C)$ and

$$\mathbf{P}\{X \in A\} = \mathbf{P}\{Y = (R, C)\}.$$

We obtain (1.3.1) if we assume in the spirit of the Local Central Limit Theorem that
the distribution of $Y$ in the vicinity of $\mathbf{E} Y$ is close to the $(m + n - 1)$-dimensional
Gaussian distribution (we lose one dimension since the row and column sums of a
matrix are bound by one linear relation: the sum of all row sums is equal to the
sum of all column sums). This assumption is not implausible since the coordinates
of $Y$ are obtained by summing up a number of independent entries of $X$.
The correction factor

(1.3.2) $\exp\bigl\{-\frac{\mu}{2} + \nu\bigr\}$

is, essentially, the Edgeworth correction in the Central Limit Theorem. In the course
of the proof of Theorem 1.3 we establish a two-sided bound

$$\gamma_1(\delta) \le \exp\Bigl\{-\frac{\mu}{2} + \nu\Bigr\} \le \gamma_2(\delta)$$

for some constants $\gamma_1(\delta), \gamma_2(\delta) > 0$ as long as the margins $(R, C)$ remain $\delta$-smooth.

De Loera [D09a], [D09b] ran a range of numerical experiments which seem 
to demonstrate that already the Gaussian approximation (1.3.1) works reason- 
ably well for contingency tables. For example, for R = (220, 215, 93, 64) and 
C = (108,286,71, 127) formula (1.3.1) approximates #(R,C) within a relative er- 
ror of about 6%, for R = C = (300, 300, 300, 300) the error is about 12% while for 
R = (65205,189726,233525,170004) and C = (137007,87762,274082,159609) the 
error is about 1.2%. 




(1.4) Computations and a change of the hyperplane. Optimization problem
(1.1.1) is convex and can be solved, for example, by interior point methods, see
[NN94]. That is, for any $\epsilon > 0$ the entries $\zeta_{jk}$ of the typical matrix $Z$ can be
computed within relative error $\epsilon$ in time polynomial in $\ln(1/\epsilon)$ and $m + n$.

Given $Z$, quantities $\det q|H$, $\mu$ and $\nu$ can be computed by linear algebra
algorithms in $O(m^2 n^2)$ time, since to compute the expectation of a polynomial with
respect to the Gaussian measure one only needs to know the covariances of the
variables, see Section 4.2. It may be convenient to replace the hyperplane $H \subset \mathbb{R}^{m+n}$
orthogonal to the null-space of $q$ by a coordinate hyperplane $L \subset \mathbb{R}^{m+n}$ defined
by any of the equations $s_j = 0$ or $t_k = 0$. Indeed, if $L \subset \mathbb{R}^{m+n}$ is any hyperplane
not containing the null-space of $q$, then the restriction $q|L$ is strictly positive
definite and one can consider the Gaussian probability measure in $L$ with the density
proportional to $e^{-q}$. We prove in Lemma 3.1 below that the expectation of any
polynomial in $s_j + t_k$ does not depend on the choice of $L$ and hence $\mu$ and $\nu$ can
be defined as in Section 1.2 with $H$ replaced by $L$. We describe the dependence
of $\det q|L$ on $L$ in Lemma 3.5. In particular, it follows that if $L$ is a coordinate
hyperplane then $\det q|H = (m + n) \det q|L$.

If we choose $L$ to be defined by the equation $t_n = 0$ then we have an explicit
formula for the matrix $Q$ of $q|L$ as follows:

$$q(x) = \frac{1}{2} \langle x, Qx \rangle \quad \text{for } x = (s_1, \ldots, s_m; t_1, \ldots, t_{n-1}),$$

where $\langle \cdot, \cdot \rangle$ is the standard scalar product and $Q = (q_{il})$ is the $(m+n-1) \times (m+n-1)$
symmetric matrix, where

$$q_{j(k+m)} = q_{(k+m)j} = \zeta_{jk}^2 + \zeta_{jk} \quad \text{for } j = 1, \ldots, m \text{ and } k = 1, \ldots, n-1,$$

$$q_{jj} = r_j + \sum_{k=1}^{n} \zeta_{jk}^2 \quad \text{for } j = 1, \ldots, m,$$

$$q_{(k+m)(k+m)} = c_k + \sum_{j=1}^{m} \zeta_{jk}^2 \quad \text{for } k = 1, \ldots, n-1,$$

and all other entries $q_{il}$ are zeros. Then $\det q|L = 2^{1-m-n} \det Q$ and $Q^{-1}$ is the
covariance matrix of $s_1, \ldots, s_m; t_1, \ldots, t_{n-1}$.
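
The recipe of this section can be sketched in code as follows. This is a hedged illustration rather than the authors' implementation: the function names are assumptions, the linear algebra is plain Gauss-Jordan elimination, and the Gaussian moments are evaluated with the standard identities $\mathbf{E} a^4 = 3\sigma_a^4$ and $\mathbf{E} a^3 b^3 = 9 \sigma_a^2 \sigma_b^2 \rho + 6 \rho^3$ for centered jointly Gaussian $a, b$ with covariance $\rho$. Combining $\det q|H = (m+n) \det q|L$ with $\det q|L = 2^{1-m-n} \det Q$, the Gaussian term (1.3.1) equals $e^{g(Z)} (2\pi)^{-(m+n-1)/2} (\det Q)^{-1/2}$.

```python
import math


def invert_with_logdet(A):
    """Gauss-Jordan elimination: inverse and log|det| of a nonsingular matrix."""
    d = len(A)
    M = [[float(v) for v in row] + [float(i == j) for j in range(d)]
         for i, row in enumerate(A)]
    logdet = 0.0
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        logdet += math.log(abs(p))
        M[col] = [v / p for v in M[col]]
        for r in range(d):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[d:] for row in M], logdet


def estimate_tables(Z):
    """Return (Gaussian term (1.3.1), value of the formula of Theorem 1.3) for Z > 0."""
    m, n = len(Z), len(Z[0])
    d = m + n - 1
    g_Z = sum((z + 1) * math.log(z + 1) - z * math.log(z) for row in Z for z in row)
    # Matrix Q of q restricted to the coordinate hyperplane t_n = 0.
    Q = [[0.0] * d for _ in range(d)]
    for j in range(m):
        for k in range(n):
            w = Z[j][k] ** 2 + Z[j][k]
            Q[j][j] += w                      # accumulates r_j + sum_k zeta_jk^2
            if k < n - 1:
                Q[m + k][m + k] += w          # accumulates c_k + sum_j zeta_jk^2
                Q[j][m + k] = Q[m + k][j] = w
    cov, logdet_Q = invert_with_logdet(Q)     # cov = Q^{-1}
    log_gauss = g_Z - 0.5 * d * math.log(2 * math.pi) - 0.5 * logdet_Q

    cells = [(j, k) for j in range(m) for k in range(n)]

    def support(j, k):
        """Coordinates of x that enter s_j + t_k on the hyperplane t_n = 0."""
        return (j,) if k == n - 1 else (j, m + k)

    def covariance(p, q):
        return sum(cov[a][b] for a in support(*p) for b in support(*q))

    var = {p: covariance(p, p) for p in cells}
    w3 = {(j, k): Z[j][k] * (Z[j][k] + 1) * (2 * Z[j][k] + 1) / 6 for j, k in cells}
    w4 = {(j, k): Z[j][k] * (Z[j][k] + 1) * (6 * Z[j][k] ** 2 + 6 * Z[j][k] + 1) / 24
          for j, k in cells}
    nu = sum(w4[p] * 3 * var[p] ** 2 for p in cells)          # nu = E h
    mu = 0.0                                                   # mu = E f^2
    for p in cells:
        for q in cells:
            c = covariance(p, q)
            mu += w3[p] * w3[q] * (9 * var[p] * var[q] * c + 6 * c ** 3)
    return math.exp(log_gauss), math.exp(log_gauss - mu / 2 + nu)
```

As a usage example, for $m = n = 3$ and all margins equal to 3 the typical matrix has all entries equal to 1 by symmetry, and `estimate_tables([[1.0] * 3 for _ in range(3)])` gives a Gaussian term of roughly 52, to be compared with the exact count 55. The double loop over pairs of cells is the $O(m^2 n^2)$ step mentioned above.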

2. An integral representation for the number of contingency
tables and the plan of the proof of Theorem 1.3

In [BH09] we prove the following general result. 

(2.1) Theorem. Let $P \subset \mathbb{R}^p$ be a polyhedron defined by the system of linear
equations $Ax = b$, where $A$ is a $d \times p$ integer matrix with columns $a_1, \ldots, a_p \in \mathbb{Z}^d$
and $b \in \mathbb{Z}^d$ is an integer vector, and inequalities $x \ge 0$ (the inequalities are
understood as coordinate-wise). Suppose that $P$ is bounded and has a non-empty
interior, that is, contains a point $x = (\xi_1, \ldots, \xi_p)$ such that $\xi_j > 0$ for $j = 1, \ldots, p$.
Then the function

$$g(x) = \sum_{j=1}^{p} \bigl((\xi_j + 1) \ln(\xi_j + 1) - \xi_j \ln \xi_j\bigr)$$

attains its maximum on $P$ at a unique point $z = (\zeta_1, \ldots, \zeta_p)$ such that $\zeta_j > 0$ for
$j = 1, \ldots, p$.

Let $\Pi \subset \mathbb{R}^d$ be the parallelepiped consisting of the points $t = (\tau_1, \ldots, \tau_d)$ such that

$$-\pi \le \tau_k \le \pi \quad \text{for } k = 1, \ldots, d.$$

Then the number $|P \cap \mathbb{Z}^p|$ of integer points in $P$ can be written as

$$|P \cap \mathbb{Z}^p| = \frac{e^{g(z)}}{(2\pi)^d} \int_{\Pi} e^{-i \langle t, b \rangle} \prod_{j=1}^{p} \frac{1}{1 + \zeta_j - \zeta_j e^{i \langle a_j, t \rangle}} \, dt,$$

where $\langle \cdot, \cdot \rangle$ is the standard scalar product in $\mathbb{R}^d$ and $i = \sqrt{-1}$.

□

The idea of the proof is as follows. Let $X = (x_1, \ldots, x_p)$ be a random vector
of independent geometric random variables $x_j$ such that $\mathbf{E} x_j = \zeta_j$. Hence values
of $X$ are non-negative integer vectors and we show in [BH09] that the probability
mass function of $X$ is constant on the set $P \cap \mathbb{Z}^p$ and equals $e^{-g(z)}$ for every integer
point in $P$. Letting $Y = AX$, we obtain

$$|P \cap \mathbb{Z}^p| = e^{g(z)} \mathbf{P}\{X \in P\} = e^{g(z)} \mathbf{P}\{Y = b\}$$

and the probability in question is written as the integral of the characteristic
function of $Y$.
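
The identity of Theorem 2.1 can be checked numerically in the simplest case $d = 1$. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the maximizer $z$ is supplied by the caller (by symmetry it has equal coordinates in the examples), and the integral is approximated by the rectangle rule on a uniform grid, which is very accurate for smooth periodic integrands.

```python
import cmath
import math


def lattice_points_via_integral(zeta, a, b, samples=512):
    """Evaluate the formula of Theorem 2.1 for one equation a_1 x_1 + ... + a_p x_p = b."""
    g_z = sum((z + 1) * math.log(z + 1) - z * math.log(z) for z in zeta)
    total = 0.0 + 0.0j
    for i in range(samples):
        t = -math.pi + 2 * math.pi * i / samples
        value = cmath.exp(-1j * t * b)
        for z, a_j in zip(zeta, a):
            value /= 1 + z - z * cmath.exp(1j * a_j * t)
        total += value
    integral = total * (2 * math.pi / samples)
    return (math.exp(g_z) * integral / (2 * math.pi)).real
```

For the segment $x_1 + x_2 = 4$, $x \ge 0$, the maximizer is $z = (2, 2)$ and the call `lattice_points_via_integral([2.0, 2.0], [1, 1], 4)` should return a value close to 5, the number of integer points.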

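Since

$$\sum_{j=1}^{p} \zeta_j a_j = b,$$

in a neighborhood of the origin $t = 0$ the integrand can be written as

(2.1.1)
$$e^{-i \langle t, b \rangle} \prod_{j=1}^{p} \frac{1}{1 + \zeta_j - \zeta_j e^{i \langle a_j, t \rangle}} = \exp\Bigl\{ -\frac{1}{2} \sum_{j=1}^{p} \bigl(\zeta_j^2 + \zeta_j\bigr) \langle a_j, t \rangle^2 - \frac{i}{6} \sum_{j=1}^{p} \zeta_j (\zeta_j + 1)(2\zeta_j + 1) \langle a_j, t \rangle^3$$
$$\qquad + \frac{1}{24} \sum_{j=1}^{p} \zeta_j (\zeta_j + 1)\bigl(6\zeta_j^2 + 6\zeta_j + 1\bigr) \langle a_j, t \rangle^4 + O\Bigl( \sum_{j=1}^{p} (\zeta_j + 1)^5 \langle a_j, t \rangle^5 \Bigr) \Bigr\}.$$

Note that the linear term is absent in the expansion.
We obtain the following corollary.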
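
The coefficients in (2.1.1) are the second, third and fourth cumulants of a geometric random variable with mean $\zeta$. A short numerical check for a single factor is sketched below; the function names are assumptions made for illustration.

```python
import cmath


def log_factor(zeta, x):
    """Logarithm of e^{-i*zeta*x} / (1 + zeta - zeta*e^{ix}), one factor of the integrand."""
    return -1j * zeta * x - cmath.log(1 + zeta - zeta * cmath.exp(1j * x))


def taylor_approximation(zeta, x):
    """Terms of degree 2, 3 and 4 of the expansion (2.1.1) for one factor."""
    k2 = zeta * (zeta + 1)
    k3 = k2 * (2 * zeta + 1)
    k4 = k2 * (6 * zeta ** 2 + 6 * zeta + 1)
    return -k2 * x ** 2 / 2 - 1j * k3 * x ** 3 / 6 + k4 * x ** 4 / 24
```

For small `x` the difference `abs(log_factor(zeta, x) - taylor_approximation(zeta, x))` is of order $x^5$, in agreement with the error term in (2.1.1).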
We obtain the following corollary. 

(2.2) Corollary. Let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be margins and let
$Z = (\zeta_{jk})$ be the typical matrix defined in Section 1.1. Let

$$F(s, t) = \exp\Bigl\{ -i \sum_{j=1}^{m} r_j s_j - i \sum_{k=1}^{n} c_k t_k \Bigr\} \prod_{\substack{1 \le j \le m \\ 1 \le k \le n}} \frac{1}{1 + \zeta_{jk} - \zeta_{jk} e^{i(s_j + t_k)}}.$$

Let $\Pi \subset \mathbb{R}^{m+n}$ be the parallelepiped consisting of the points $(s_1, \ldots, s_m; t_1, \ldots, t_n)$
such that

$$-\pi \le s_j, t_k \le \pi \quad \text{for all } j, k.$$

Let us identify $\mathbb{R}^{m+n-1}$ with the hyperplane $t_n = 0$ in $\mathbb{R}^{m+n}$ and let $\Pi_0 \subset \Pi$ be
the facet of $\Pi$ defined by the equation $t_n = 0$. Then

$$\#(R, C) = \frac{e^{g(Z)}}{(2\pi)^{m+n-1}} \int_{\Pi_0} F(s, t) \, ds \, dt,$$

where $ds \, dt$ is the Lebesgue measure in $\Pi_0$.

Proof. The number $\#(R, C)$ of non-negative integer $m \times n$ matrices with row sums
$R$ and column sums $C$ is the number of integer points in the transportation polytope
$P(R, C)$. We can define $P(R, C)$ by prescribing all row sums $r_1, \ldots, r_m$ and all but
one column sums $c_1, \ldots, c_{n-1}$ of a non-negative $m \times n$ matrix. Applying Theorem
2.1, we get the desired integral representation. □



From (2.1.1) we get the following expansion in the neighborhood of
$s_1 = \ldots = s_m = t_1 = \ldots = t_n = 0$:

(2.2.1)
$$F(s, t) = \exp\Bigl\{ -q(s, t) - i f(s, t) + h(s, t) + O\Bigl( \sum_{j,k} (1 + \zeta_{jk})^5 (s_j + t_k)^5 \Bigr) \Bigr\},$$

where $q$, $f$, and $h$ are defined by (1.2.1)-(1.2.2).

(2.3) The plan of the proof of Theorem 1.3. First, we argue that it suffices to
prove Theorem 1.3 under one additional assumption, namely, that the parameter
$\tau$ in (1.1.3) is bounded by a polynomial in $m + n$:

$$\tau \le (m + n)^{1/\delta} \quad \text{for some } \delta > 0$$

(for example, one can choose $\delta = 1/10$). Indeed, it follows by results of [D+97] (see
Lemma 3 there) that for $\tau \ge (mn)^2$ the (properly normalized) volume $\operatorname{vol} P(R, C)$
of the transportation polytope approximates the number of tables $\#(R, C)$ within
a relative error of $O((m + n)^{-1})$. Since $\dim P(R, C) = (m - 1)(n - 1)$ and

$$\operatorname{vol} P(\alpha R, \alpha C) = \alpha^{(m-1)(n-1)} \operatorname{vol} P(R, C) \quad \text{for } \alpha > 0,$$

to handle larger $\tau$ it suffices to show that the formula of Theorem 1.3 scales the
right way if the margins $(R, C)$ get scaled $(R, C) \mapsto (\alpha R, \alpha C)$ (and appropriately
rounded, if the obtained margins are not integer). If $\tau$ is large enough then scaling
results in an approximate scaling $e^{g(Z)} \mapsto \alpha^{mn} e^{g(Z)}$, $q \mapsto \alpha^2 q$, $f \mapsto \alpha^3 f$ and
$h \mapsto \alpha^4 h$ and hence the value produced by the formula of Theorem 1.3 gets
multiplied by roughly $\alpha^{(m-1)(n-1)}$, as desired. We provide necessary details in
Section 8.

To handle the case of $\tau$ bounded by a polynomial in $m + n$, we use the integral
representation of Corollary 2.2.

Let us define a neighborhood $U \subset \Pi_0$ of the origin by

$$U = \Bigl\{ (s_1, \ldots, s_m; t_1, \ldots, t_{n-1}) : \ |s_j|, |t_k| \le \frac{\ln(m + n)}{\tau \sqrt{m + n}} \ \text{for all } j, k \Bigr\}.$$

We show that the integral of $F(s, t)$ over $\Pi_0 \setminus U$ is asymptotically negligible. Namely,
in Section 7 we prove that the integral

$$\int_{\Pi_0 \setminus U} |F(s, t)| \, ds \, dt$$

is asymptotically negligible compared to the integral

(2.3.1) $\int_U |F(s, t)| \, ds \, dt.$

In Section 6, we evaluate the integral

(2.3.2) $\int_U F(s, t) \, ds \, dt$

and show that it produces the asymptotic formula of Theorem 1.3. In particular,
we show that (2.3.1) and (2.3.2) are of the same order, that is,

$$\int_U |F(s, t)| \, ds \, dt \le \gamma(\delta) \, \Bigl| \int_U F(s, t) \, ds \, dt \Bigr|$$

for some constant $\gamma(\delta) \ge 1$. Hence the integral of $F(s, t)$ outside of $U$ is indeed
asymptotically irrelevant.

From (2.2.1), we deduce that

$$F(s, t) \approx \exp\{-q(s, t) - i f(s, t) + h(s, t)\} \quad \text{for } (s, t) \in U,$$

where $q$ is defined by (1.2.1) and $f$ and $h$ are defined by (1.2.2), so that the
contribution of the terms of order 5 and higher in (2.2.1) is asymptotically negligible in
the integral (2.3.2). The integral of $e^{-q}$ over $U$ produces the Gaussian term (1.3.1).
However, both the cubic term $f(s, t)$ and the fourth-order term $h(s, t)$ contribute
substantially to the integral, correcting the Gaussian term (1.3.1) by a constant
factor.

Let us consider the Gaussian probability measure in the coordinate hyperplane
$t_n = 0$, which we identify with $\mathbb{R}^{m+n-1}$, with the density proportional to $e^{-q}$. In
Section 5, we show that with respect to that measure, $h(s, t)$ remains, essentially,
constant in the neighborhood $U$:

$$h(s, t) \approx \mathbf{E} h = \nu \quad \text{almost everywhere in } U.$$

This allows us to conclude that asymptotically

$$\int_U \exp\{-q(s, t) - i f(s, t) + h(s, t)\} \, ds \, dt \approx e^{\nu} \int_U \exp\{-q(s, t) - i f(s, t)\} \, ds \, dt.$$

In Section 4, we show that $f(s, t)$ behaves, essentially, as a Gaussian random
variable with respect to the probability measure in $\mathbb{R}^{m+n-1}$ with the density
proportional to $e^{-q}$, so

$$\int_U \exp\{-q(s, t) - i f(s, t)\} \, ds \, dt \approx \int_{\mathbb{R}^{m+n-1}} \exp\{-q(s, t) - i f(s, t)\} \, ds \, dt \approx \exp\Bigl\{-\frac{1}{2} \mathbf{E} f^2\Bigr\} \int_{\mathbb{R}^{m+n-1}} e^{-q(s, t)} \, ds \, dt,$$

which concludes the computation of (2.3.2).

The results of Sections 4 and 5 are based on the analysis in Section 3. In Section
3, we consider coordinate functions $s_j$ and $t_k$ as random variables with respect to
the Gaussian probability measure on a hyperplane $L \subset \mathbb{R}^{m+n}$ not containing the
null-space of $q$ with the density proportional to $e^{-q}$. We show that $s_{j_1} + t_{k_1}$ and
$s_{j_2} + t_{k_2}$ are weakly correlated provided $j_1 \ne j_2$ and $k_1 \ne k_2$, that is,

$$\mathbf{E} (s_{j_1} + t_{k_1})(s_{j_2} + t_{k_2}) = O\Bigl( \frac{1}{\tau^2 (m + n)^2} \Bigr) \quad \text{provided } j_1 \ne j_2 \text{ and } k_1 \ne k_2 \text{ and}$$

$$\mathbf{E} (s_j + t_k)^2 = O\Bigl( \frac{1}{\tau^2 (m + n)} \Bigr) \quad \text{for all } j, k.$$



(2.4) Notation. In what follows, we denote by $\gamma$, sometimes with an index or a
list of parameters, a positive constant depending on the parameters. The actual
value of $\gamma$ may change from line to line. The most common appearance will be
$\gamma(\delta)$, a positive constant depending only on the $\delta$-smoothness constant $\delta$.

As usual, for two functions $f$ and $g$, where $g$ is non-negative, we say that $f = O(g)$
if $|f| \le \gamma g$ for some constant $\gamma > 0$ and that $f = \Omega(g)$ if $f \ge \gamma g$ for some constant
$\gamma > 0$.

3. Correlations 

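Recall (see Section 1.2) that the quadratic form $q : \mathbb{R}^{m+n} \to \mathbb{R}$ is defined by

$$q(s, t) = \frac{1}{2} \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} \bigl(\zeta_{jk}^2 + \zeta_{jk}\bigr)(s_j + t_k)^2 \quad \text{for } (s, t) = (s_1, \ldots, s_m; t_1, \ldots, t_n).$$

Let

$$u = (\underbrace{1, \ldots, 1}_{m \text{ times}};\ \underbrace{-1, \ldots, -1}_{n \text{ times}}).$$

Let $L \subset \mathbb{R}^{m+n}$ be a hyperplane which does not contain $u$. Then the restriction $q|L$
of $q$ onto $L$ is a positive definite quadratic form and we can consider the Gaussian
probability measure on $L$ with the density proportional to $e^{-q}$. We consider $s_j$ and
$t_k$ as random variables on $L$ and estimate their covariances.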

(3.1) Lemma. For any $1 \le j_1, j_2 \le m$ and any $1 \le k_1, k_2 \le n$ the covariance

$$\mathbf{E} (s_{j_1} + t_{k_1})(s_{j_2} + t_{k_2})$$

is independent of the choice of the hyperplane $L$, as long as $L$ does not contain $u$.

Proof. Let $L_1, L_2 \subset \mathbb{R}^{m+n}$ be two hyperplanes not containing $u$. Then we can define
the projection $\operatorname{pr} : L_1 \to L_2$ along the span of $u$, so that $\operatorname{pr}(x)$ for $x \in L_1$ is the
unique $y \in L_2$ such that $y - x$ is a multiple of $u$. We note that $q(x) = q(x + tu)$ for all
$x \in \mathbb{R}^{m+n}$ and all $t \in \mathbb{R}$. Therefore, the push-forward of the Gaussian probability
measure on $L_1$ with the density proportional to $e^{-q}$ is the probability measure on
$L_2$ with the density proportional to $e^{-q}$. Moreover, the value of $s_j + t_k$ does not
change under the projection and hence the result follows. □

The main result of this section is the following theorem. 

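(3.2) Theorem. Let us fix a number $\delta > 0$ and suppose that

$$\tau \delta \le \zeta_{jk} \le \tau \quad \text{for all } j, k$$

and some $\tau > 0$. Suppose that $\delta m \le n$ and $\delta n \le m$.
Let us define

$$a_j = \sum_{k=1}^{n} \bigl(\zeta_{jk}^2 + \zeta_{jk}\bigr) \quad \text{for } j = 1, \ldots, m \text{ and}$$

$$b_k = \sum_{j=1}^{m} \bigl(\zeta_{jk}^2 + \zeta_{jk}\bigr) \quad \text{for } k = 1, \ldots, n.$$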

Let
5 15 / 2 (t 2 +t) 



ran 



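Let $L \subset \mathbb{R}^{m+n}$ be a hyperplane not containing the null-space of $q$. Let us consider
the Gaussian probability measure on $L$ with the density proportional to $e^{-q}$.
Then

$$\bigl| \mathbf{E} (s_{j_1} + t_{k_1})(s_{j_2} + t_{k_2}) \bigr| \le \Delta \quad \text{provided } j_1 \ne j_2 \text{ and } k_1 \ne k_2,$$

$$\Bigl| \mathbf{E} (s_j + t_{k_1})(s_j + t_{k_2}) - \frac{1}{a_j} \Bigr| \le \Delta \quad \text{provided } k_1 \ne k_2,$$

$$\Bigl| \mathbf{E} (s_{j_1} + t_k)(s_{j_2} + t_k) - \frac{1}{b_k} \Bigr| \le \Delta \quad \text{provided } j_1 \ne j_2 \text{ and}$$

$$\Bigl| \mathbf{E} (s_j + t_k)^2 - \frac{1}{a_j} - \frac{1}{b_k} \Bigr| \le \Delta \quad \text{for all } j \text{ and } k.$$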



The gist of Theorem 3.2 is that for a fixed $\delta > 0$, while generally the covariance
of $s_{j_1} + t_{k_1}$ and $s_{j_2} + t_{k_2}$ is $O(\tau^{-2} (m + n)^{-1})$, it is only $O(\tau^{-2} (m + n)^{-2})$ when
$j_1 \ne j_2$ and $k_1 \ne k_2$.

In what follows, we will often deal with the following situation. Let $V$ be
Euclidean space, let $\phi : V \to \mathbb{R}$ be a positive semidefinite quadratic form and let
$L \subset V$ be a subspace such that the restriction of $\phi$ onto $L$ is strictly positive definite.
We consider the Gaussian probability measure on $L$ with the density proportional
to $e^{-\phi}$. For a polynomial (random variable) $f : L \to \mathbb{R}$ we denote by $\mathbf{E}(f; \phi|L)$
the expectation of $f$ with respect to that Gaussian measure. Instead of $\mathbf{E}(f; \phi|V)$
we write simply $\mathbf{E}(f; \phi)$.

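We will use the following standard facts. Suppose that there is a direct sum
decomposition $V = L_1 + L_2 + \ldots + L_k$ where $L_i$ are pairwise orthogonal, such that

$$\phi(x_1 + \ldots + x_k) = \sum_{i=1}^{k} \phi(x_i) \quad \text{for all } x_i \in L_i.$$

In other words, the components $x_i \in L_i$ of a random point $x = x_1 + \ldots + x_k$, $x \in V$,
are independent. Then for any two linear functions $\ell_1, \ell_2 : V \to \mathbb{R}$ we have

$$\mathbf{E}(\ell_1 \ell_2; \phi) = \sum_{i=1}^{k} \mathbf{E}(\ell_1 \ell_2; \phi|L_i).$$

Indeed, since

$$\ell_{1,2}(x_1 + \ldots + x_k) = \sum_{i=1}^{k} \ell_{1,2}(x_i),$$

we obtain

$$\mathbf{E}(\ell_1 \ell_2; \phi) = \sum_{i_1=1}^{k} \sum_{i_2=1}^{k} \mathbf{E}\bigl(\ell_1(x_{i_1}) \ell_2(x_{i_2}); \phi\bigr).$$

If $i_1 \ne i_2$, we have

$$\mathbf{E}\bigl(\ell_1(x_{i_1}) \ell_2(x_{i_2}); \phi\bigr) = \mathbf{E}(\ell_1; \phi|L_{i_1}) \, \mathbf{E}(\ell_2; \phi|L_{i_2}) = 0$$

while for $i_1 = i_2 = i$ we have

$$\mathbf{E}\bigl(\ell_1(x_i) \ell_2(x_i); \phi\bigr) = \mathbf{E}(\ell_1 \ell_2; \phi|L_i).$$

We deduce Theorem 3.2 from the following statement.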
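(3.3) Proposition. Let $m$ and $n$ be positive integers such that

$$\delta m \le n \text{ and } \delta n \le m \quad \text{for some } 0 < \delta < 1.$$

Let $\xi_{jk}$, $j = 1, \ldots, m$ and $k = 1, \ldots, n$, be real numbers such that

$$\alpha \le \xi_{jk} \le \beta \quad \text{for all } j, k$$

and some $\beta \ge \alpha > 0$. Let

$$a_j = \sum_{k=1}^{n} \xi_{jk} \quad \text{for } j = 1, \ldots, m \text{ and}$$

$$b_k = \sum_{j=1}^{m} \xi_{jk} \quad \text{for } k = 1, \ldots, n.$$

Let us define a quadratic form $\psi : \mathbb{R}^{m+n} \to \mathbb{R}$ by

$$\psi(s, t) = \frac{1}{2} \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} \xi_{jk} \Bigl( \frac{s_j}{\sqrt{a_j}} + \frac{t_k}{\sqrt{b_k}} \Bigr)^2 \quad \text{for } (s, t) = (s_1, \ldots, s_m; t_1, \ldots, t_n).$$

Let $L \subset \mathbb{R}^{m+n}$ be the hyperplane consisting of the points $(s_1, \ldots, s_m; t_1, \ldots, t_n)$
such that

$$\sum_{j=1}^{m} s_j \sqrt{a_j} = \sum_{k=1}^{n} t_k \sqrt{b_k}.$$

Then the restriction $\psi|L$ of $\psi$ onto $L$ is strictly positive definite and for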



A = 3 



/A 7/2 l 



ran 



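we have

$$\bigl| \mathbf{E}(s_j^2; \psi|L) - 1 \bigr|, \ \bigl| \mathbf{E}(t_k^2; \psi|L) - 1 \bigr| \le \Delta \quad \text{for all } j, k,$$

$$\bigl| \mathbf{E}(s_{j_1} s_{j_2}; \psi|L) \bigr|, \ \bigl| \mathbf{E}(t_{k_1} t_{k_2}; \psi|L) \bigr| \le \Delta \quad \text{for all } j_1 \ne j_2 \text{ and } k_1 \ne k_2,$$

$$\bigl| \mathbf{E}(s_j t_k; \psi|L) \bigr| \le \Delta \quad \text{for all } j, k.$$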

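Proof. Clearly, the null-space of $\psi$ is one-dimensional and spanned by vector

$$w = \bigl( \sqrt{a_1}, \ldots, \sqrt{a_m}; \ -\sqrt{b_1}, \ldots, -\sqrt{b_n} \bigr).$$

We have $L = w^{\perp}$ and hence the restriction of $\psi$ onto $L$ is positive definite.
Next, we observe that

$$v = \bigl( \sqrt{a_1}, \ldots, \sqrt{a_m}; \ \sqrt{b_1}, \ldots, \sqrt{b_n} \bigr)$$

is an eigenvector of $\psi$ with eigenvalue 1. Indeed, the gradient of $\psi(x)$ at $x = v$ is
equal to $2v$:

$$\frac{\partial}{\partial s_j} \psi \Big|_{s_j = \sqrt{a_j},\, t_k = \sqrt{b_k}} = \frac{2}{\sqrt{a_j}} \sum_{k=1}^{n} \xi_{jk} = 2\sqrt{a_j} \quad \text{and}$$

$$\frac{\partial}{\partial t_k} \psi \Big|_{s_j = \sqrt{a_j},\, t_k = \sqrt{b_k}} = \frac{2}{\sqrt{b_k}} \sum_{j=1}^{m} \xi_{jk} = 2\sqrt{b_k}.$$

We write

(3.3.1)
$$\psi(s, t) = \frac{1}{2} \sum_{j=1}^{m} s_j^2 + \frac{1}{2} \sum_{k=1}^{n} t_k^2 + \sum_{\substack{j = 1, \ldots, m \\ k = 1, \ldots, n}} \frac{\xi_{jk}}{\sqrt{a_j b_k}} \, s_j t_k.$$

Let

$$c = a_1 + \ldots + a_m = b_1 + \ldots + b_n$$

and let us consider another quadratic form $\phi : \mathbb{R}^{m+n} \to \mathbb{R}$ defined by

(3.3.2)
$$\phi(s, t) = \frac{1}{c} \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} \sqrt{a_j b_k} \, s_j t_k \quad \text{for } (s, t) = (s_1, \ldots, s_m; t_1, \ldots, t_n).$$

Clearly, $\phi(s, t)$ is a form of rank 2. Its non-zero eigenvalues are $-1/2$ with the
eigenspace spanned by $w$ and $1/2$ with the eigenspace spanned by $v$.
Let us define a subspace $L_0 \subset \mathbb{R}^{m+n}$ of codimension 2 by

$$L_0 = (v, w)^{\perp}.$$

In other words, $L_0$ consists of the points $(s_1, \ldots, s_m; t_1, \ldots, t_n)$ such that

$$\sum_{j=1}^{m} s_j \sqrt{a_j} = \sum_{k=1}^{n} t_k \sqrt{b_k} = 0.$$

In particular,

$$\phi(s, t) = 0 \quad \text{for all } (s, t) \in L_0.$$

Let us define a quadratic form

(3.3.3) $\tilde{\psi} = \psi - \epsilon^2 \phi$ for $\epsilon = \alpha/\beta$.

We note that $\tilde{\psi}$ is strictly positive definite. Indeed, $w$ and $v$ are eigenvectors of $\tilde{\psi}$
with the eigenvalues $\epsilon^2/2 > 0$ and $1 - \epsilon^2/2 > 0$ respectively and $\tilde{\psi}$ coincides with
$\psi$ on the subspace $L_0 = (v, w)^{\perp}$, where $\psi$ is positive definite. Our immediate goal
is to bound the covariances

$$\mathbf{E}(s_{j_1} s_{j_2}; \tilde{\psi}), \quad \mathbf{E}(t_{k_1} t_{k_2}; \tilde{\psi}) \quad \text{and} \quad \mathbf{E}(s_j t_k; \tilde{\psi}).$$

We can write

$$\tilde{\psi}(x) = \frac{1}{2} \langle x, (I + P) x \rangle \quad \text{for } x = (s, t),$$

where $I$ is the $(m + n) \times (m + n)$ identity matrix, $P = (p_{il})$ is a symmetric
$(m + n) \times (m + n)$ matrix with zero diagonal and $\langle \cdot, \cdot \rangle$ is the standard scalar product
in $\mathbb{R}^{m+n}$. Since

(3.3.4)
$$\alpha n \le a_j \le \beta n \ \text{ for } j = 1, \ldots, m, \qquad \alpha m \le b_k \le \beta m \ \text{ for } k = 1, \ldots, n \qquad \text{and} \qquad c \ge \alpha m n,$$

by (3.3.1) - (3.3.3), for the entries $p_{il}$ of $P$ we have

(3.3.5)
$$0 \le p_{il} \le \frac{\beta}{\alpha \sqrt{mn}} = \frac{1}{\epsilon \sqrt{mn}} \quad \text{for all } i, l.$$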

Furthermore, $v$ is the Perron-Frobenius eigenvector of $P$ with the corresponding
eigenvalue $1 - \epsilon^2$.

Let us bound the entries of a positive integer power P d = (pu^ °f P- Let 

K = a 3/ 2(5 l/f (mn) 3/4 alldlet f = " V > y=(Vl,..- ,Vm+n). 

From (3.3.4) we conclude that 



«7, frfc > av&rrvn for all j,k 

and hence by (3.3.5) 

(3.3.6) pu < rji for all z, /. 

Similarly, from (3.3.4), we conclude 



bk < /3a/ mn/5 for all j, /c 

and hence 

(3.3.7) n < — f = for all i. 

Besides, y is an eigenvector of P d with the eigenvalue (1 — e 2 ) d . Therefore, for d > 
we have 



(d+1) (d) 

Pa -Z^PijPji 



<(l-e 2 ) d ' 



e 3 / 2 \ / 5mn 



Consequently, the series 



+00 



(i+p)- 1 = i +j2(- i ) dpo 



d=i 



converges absolutely and we can bound the entries of Q = (I + P) 1 , q = (qu), by 

1 1 1 



9*1 ^ 



9« - 1 ^ 



e 2 e 3 / 2 VSmn e 7 / 2 VSmn 
and 

1 



if z^Z 



ran 



On the other hand, Q is the matrix of covariances of functions s\, . . . ,s m ;ti,... ,t n , 
so we have 



E (^; 

E (s^Sja; ^) 



(3.3.8) 



(t kl tk 2 ; V>) 
and 



E 



E Sjtfc; V 



< 



< 



E (t£; V] - I 
1 



< 



1 



e r / 2 \/5mn 
1 



< 



e 7 / 2 \ / dmn 
1 



,7/2^ 



mn 



e 7 / 2 VSmn 
if fei 7^ fc 2 



for all j, /c. 



for all j, /c, 



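Now we go from $\tilde{\psi}$ back to $\psi$. Since $v$ and $w$ are eigenvectors of $\tilde{\psi}$ and since
$L_0 = (v, w)^{\perp}$, for any linear functions $\ell_1, \ell_2 : \mathbb{R}^{m+n} \to \mathbb{R}$ we have

$$\mathbf{E}(\ell_1 \ell_2; \tilde{\psi}|L_0) = \mathbf{E}(\ell_1 \ell_2; \tilde{\psi}) - \mathbf{E}(\ell_1 \ell_2; \tilde{\psi}|\operatorname{span}(w)) - \mathbf{E}(\ell_1 \ell_2; \tilde{\psi}|\operatorname{span}(v)).$$

On the other hand, since $\psi$ and $\tilde{\psi}$ coincide on $L_0$, we have

$$\mathbf{E}(\ell_1 \ell_2; \psi|L_0) = \mathbf{E}(\ell_1 \ell_2; \tilde{\psi}|L_0).$$

Finally, since $v$ is an eigenvector of $\psi$ and $L_0$ is the orthogonal complement to $v$ in
$L$, we have

$$\mathbf{E}(\ell_1 \ell_2; \psi|L) = \mathbf{E}(\ell_1 \ell_2; \psi|L_0) + \mathbf{E}(\ell_1 \ell_2; \psi|\operatorname{span}(v)).$$

Therefore,

(3.3.9)
$$\mathbf{E}(\ell_1 \ell_2; \psi|L) = \mathbf{E}(\ell_1 \ell_2; \tilde{\psi}) - \mathbf{E}(\ell_1 \ell_2; \tilde{\psi}|\operatorname{span}(w)) - \mathbf{E}(\ell_1 \ell_2; \tilde{\psi}|\operatorname{span}(v)) + \mathbf{E}(\ell_1 \ell_2; \psi|\operatorname{span}(v)).$$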



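We note that the gradient of function $s_j$ restricted onto $\operatorname{span}(w)$ is $\sqrt{a_j}/\sqrt{2c}$. Since
$w$ is an eigenvector of $\tilde{\psi}$ with eigenvalue $\epsilon^2/2$, we have

$$\mathbf{E}(s_{j_1} s_{j_2}; \tilde{\psi}|\operatorname{span}(w)) = \frac{\sqrt{a_{j_1} a_{j_2}}}{2 \epsilon^2 c} \le \frac{1}{2 \epsilon^3 \sqrt{\delta m n}} \quad \text{for all } j_1, j_2.$$

Similarly,

$$\mathbf{E}(t_{k_1} t_{k_2}; \tilde{\psi}|\operatorname{span}(w)) = \frac{\sqrt{b_{k_1} b_{k_2}}}{2 \epsilon^2 c} \le \frac{1}{2 \epsilon^3 \sqrt{\delta m n}} \quad \text{for all } k_1, k_2$$

and

$$\mathbf{E}(s_j t_k; \tilde{\psi}|\operatorname{span}(w)) = -\frac{\sqrt{a_j b_k}}{2 \epsilon^2 c} \ge -\frac{1}{2 \epsilon^3 \sqrt{mn}} \quad \text{for all } j, k.$$

Since $v$ is an eigenvector of $\tilde{\psi}$ with eigenvalue $1 - \epsilon^2/2 \ge 1/2$, we obtain

$$\mathbf{E}(s_{j_1} s_{j_2}; \tilde{\psi}|\operatorname{span}(v)) = \frac{\sqrt{a_{j_1} a_{j_2}}}{4 (1 - \epsilon^2/2) c} \le \frac{1}{2 \epsilon \sqrt{\delta m n}} \quad \text{for all } j_1, j_2.$$

Similarly,

$$\mathbf{E}(t_{k_1} t_{k_2}; \tilde{\psi}|\operatorname{span}(v)) = \frac{\sqrt{b_{k_1} b_{k_2}}}{4 (1 - \epsilon^2/2) c} \le \frac{1}{2 \epsilon \sqrt{\delta m n}} \quad \text{for all } k_1, k_2$$

and

$$\mathbf{E}(s_j t_k; \tilde{\psi}|\operatorname{span}(v)) = \frac{\sqrt{a_j b_k}}{4 (1 - \epsilon^2/2) c} \le \frac{1}{2 \epsilon \sqrt{mn}} \quad \text{for all } j, k.$$

Since $v$ is an eigenvector of $\psi$ with eigenvalue 1, we get

$$\mathbf{E}(s_{j_1} s_{j_2}; \psi|\operatorname{span}(v)) = \frac{\sqrt{a_{j_1} a_{j_2}}}{4c} \le \frac{1}{4 \epsilon \sqrt{\delta m n}} \quad \text{for all } j_1, j_2.$$

Similarly,

$$\mathbf{E}(t_{k_1} t_{k_2}; \psi|\operatorname{span}(v)) = \frac{\sqrt{b_{k_1} b_{k_2}}}{4c} \le \frac{1}{4 \epsilon \sqrt{\delta m n}} \quad \text{for all } k_1, k_2$$

and

$$\mathbf{E}(s_j t_k; \psi|\operatorname{span}(v)) = \frac{\sqrt{a_j b_k}}{4c} \le \frac{1}{4 \epsilon \sqrt{mn}} \quad \text{for all } j, k.$$

Combining (3.3.8) and (3.3.9), we complete the proof. □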

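Now we are ready to prove Theorem 3.2.

(3.4) Proof of Theorem 3.2. Let us define

$$\xi_{jk} = \zeta_{jk}^2 + \zeta_{jk} \quad \text{for all } j, k.$$

Hence we have

$$\alpha \le \xi_{jk} \le \beta \quad \text{for all } j, k, \text{ where } \alpha = \tau \delta + \tau^2 \delta^2 \text{ and } \beta = \tau + \tau^2.$$

We have

(3.4.1)
$$\frac{\beta}{\alpha} = \frac{\tau + \tau^2}{\tau \delta + \tau^2 \delta^2} = \frac{1 + \tau}{\delta + \tau \delta^2} \le \frac{1}{\delta^2}.$$

Let

$$a_j = \sum_{k=1}^{n} \xi_{jk} \quad \text{and} \quad b_k = \sum_{j=1}^{m} \xi_{jk}.$$

In particular, we have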

< (V + r 2 ) n for 7 = 1, .... m and 
(3.4.2) J " 2 

6^ < [t + r ) m for fe = 1, . . . , n. 



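We apply Proposition 3.3 to the quadratic form

$$\psi(s, t) = \frac{1}{2} \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} \xi_{jk} \Bigl( \frac{s_j}{\sqrt{a_j}} + \frac{t_k}{\sqrt{b_k}} \Bigr)^2$$

and the hyperplane $L_1 \subset \mathbb{R}^{m+n}$ defined by the equation

$$\sum_{j=1}^{m} s_j \sqrt{a_j} = \sum_{k=1}^{n} t_k \sqrt{b_k}.$$

Let us consider a linear transformation

$$(s_1, \ldots, s_m; t_1, \ldots, t_n) \mapsto \bigl( s_1 \sqrt{a_1}, \ldots, s_m \sqrt{a_m}; \ t_1 \sqrt{b_1}, \ldots, t_n \sqrt{b_n} \bigr)$$

and the hyperplane $L_2 \subset \mathbb{R}^{m+n}$ defined by the equation

$$\sum_{j=1}^{m} a_j s_j = \sum_{k=1}^{n} b_k t_k.$$

Then $L_2$ is mapped onto $L_1$ and the push-forward of the Gaussian probability
measure on $L_2$ with the density proportional to $e^{-q}$ is the Gaussian probability
measure on $L_1$ with the density proportional to $e^{-\psi}$.
We have

(3.4.3)
$$\mathbf{E}(s_{j_1} s_{j_2}; q|L_2) = \frac{1}{\sqrt{a_{j_1} a_{j_2}}} \mathbf{E}(s_{j_1} s_{j_2}; \psi|L_1) \quad \text{for all } j_1, j_2,$$
$$\mathbf{E}(t_{k_1} t_{k_2}; q|L_2) = \frac{1}{\sqrt{b_{k_1} b_{k_2}}} \mathbf{E}(t_{k_1} t_{k_2}; \psi|L_1) \quad \text{for all } k_1, k_2, \text{ and}$$
$$\mathbf{E}(s_j t_k; q|L_2) = \frac{1}{\sqrt{a_j b_k}} \mathbf{E}(s_j t_k; \psi|L_1) \quad \text{for all } j, k.$$

By (3.4.1), we have $\beta/\alpha \le \delta^{-2}$. Since by Lemma 3.1, for any hyperplane $L \subset \mathbb{R}^{m+n}$
not containing $u$ we have

$$\mathbf{E}\bigl( (s_{j_1} + t_{k_1})(s_{j_2} + t_{k_2}); q|L \bigr) = \mathbf{E}\bigl( (s_{j_1} + t_{k_1})(s_{j_2} + t_{k_2}); q|L_2 \bigr),$$

the proof follows by Proposition 3.3 applied to $L_1$ and $\psi$ and (3.4.1)-(3.4.3). □

We will need the following result.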
We will need the following result. 

(3.5) Lemma. Let $V$ be Euclidean space and let $q : V \to \mathbb{R}$ be a quadratic form
such that $\operatorname{rank} q = \dim V - 1$. Let $v \in V$ be the unit eigenvector of $q$ with the
eigenvalue 0 and let $H = v^{\perp}$ be the orthogonal complement of $v$. Then for a unit
vector $u \in V$ we have

$$\det q|u^{\perp} = \langle u, v \rangle^2 \det q|H.$$

Proof. This is Lemma 2.3 of [B97b]. □

We apply Lemma 3.5 in the following situation. Let $V = \mathbb{R}^{m+n}$ and let $q$ be
defined by (1.2.1). Let $L$ be a coordinate hyperplane defined by one of the equations
$s_j = 0$ or $t_k = 0$. Then

$$\det q|L = \frac{1}{m + n} \det q|H.$$

In particular, the value of $\det q|L$ does not depend on the choice of the coordinate
hyperplane.

Finally, we need the following result. 

(3.6) Lemma. Let $q_0 : \mathbb{R}^{m+n} \to \mathbb{R}$ be the quadratic form defined by the formula

$q_0(s,t) = \frac{1}{2} \sum_{1 \le j \le m,\ 1 \le k \le n} (s_j + t_k)^2$.

Then the eigenspaces of $q_0$ are as follows:

The 1-dimensional eigenspace $E_1$ with the eigenvalue 0 spanned by vector

$u = (1, \ldots, 1;\ -1, \ldots, -1)$ (with $m$ entries equal to $1$ and $n$ entries equal to $-1$);

The $(n-1)$-dimensional eigenspace $E_2$ with the eigenvalue $m/2$ consisting of the vectors such that

$\sum_{k=1}^{n} t_k = 0$ and $s_1 = \ldots = s_m = 0$;

The $(m-1)$-dimensional eigenspace $E_3$ with the eigenvalue $n/2$ consisting of the vectors such that

$\sum_{j=1}^{m} s_j = 0$ and $t_1 = \ldots = t_n = 0$

and

The 1-dimensional eigenspace $E_4$ with the eigenvalue $(m+n)/2$ spanned by vector

$v = (n, \ldots, n;\ m, \ldots, m)$ (with $m$ entries equal to $n$ and $n$ entries equal to $m$).

Proof. Clearly, $E_1$ is the eigenspace with the eigenvalue 0. It is then straightforward to check that the gradient of $q_0$ at $x = (s,t)$ equals $mx$ for $x \in E_2$, equals $nx$ for $x \in E_3$ and equals $(m+n)x$ for $x \in E_4$. □
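
The gradient check used in the proof of Lemma 3.6 can be reproduced mechanically. The following is a minimal sketch, assuming $q_0(s,t) = \frac{1}{2}\sum_{j,k}(s_j+t_k)^2$, so that the partial derivatives are $n s_j + \sum_k t_k$ and $m t_k + \sum_j s_j$; the function names and the sample sizes $m = 3$, $n = 5$ are illustrative choices, not part of the paper.

```python
def gradient_q0(s, t):
    """Gradient of q0(s, t) = (1/2) * sum_{j,k} (s_j + t_k)^2."""
    m, n = len(s), len(t)
    sum_s, sum_t = sum(s), sum(t)
    grad_s = [n * sj + sum_t for sj in s]  # d q0 / d s_j
    grad_t = [m * tk + sum_s for tk in t]  # d q0 / d t_k
    return grad_s, grad_t


def gradient_is_multiple(s, t, factor):
    """Check that the gradient of q0 at x = (s, t) equals factor * x."""
    grad_s, grad_t = gradient_q0(s, t)
    return (grad_s == [factor * x for x in s]
            and grad_t == [factor * x for x in t])


m, n = 3, 5  # illustrative sizes (an assumption of this sketch)
checks = {
    "E1": gradient_is_multiple([1] * m, [-1] * n, 0),
    "E2": gradient_is_multiple([0] * m, [1, -1, 0, 0, 0], m),
    "E3": gradient_is_multiple([1, -1, 0], [0] * n, n),
    "E4": gradient_is_multiple([n] * m, [m] * n, m + n),
}
```

Since the gradient of a quadratic form $q_0(x) = \langle x, Ax \rangle / 2$ is $Ax$, a gradient equal to $\lambda x$ corresponds to the eigenvalue $\lambda/2$ of $q_0$, which matches the values $0$, $m/2$, $n/2$ and $(m+n)/2$ in the lemma.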

4. The third degree term 

In this section we prove the following main result. 

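(4.1) Theorem. Let $u_{jk}$, $j = 1, \ldots, m$ and $k = 1, \ldots, n$ be Gaussian random variables such that

$\mathbf{E}\, u_{jk} = 0$ for all $j, k$.

Suppose further that for some $\theta > 0$

$\mathbf{E}\, u_{jk}^2 \le \frac{\theta}{m+n}$ for all $j, k$ and

$\left|\mathbf{E}\, u_{j_1k_1} u_{j_2k_2}\right| \le \frac{\theta}{(m+n)^2}$ provided $j_1 \ne j_2$ and $k_1 \ne k_2$.

Let

$U = \sum_{1 \le j \le m,\ 1 \le k \le n} u_{jk}^3$.

Then for some constant $\gamma(\theta) > 0$ and any $0 < \epsilon \le 1/2$ we have

$\left| \mathbf{E} \exp\{iU\} - \exp\left\{ -\frac{1}{2} \mathbf{E}\, U^2 \right\} \right| \le \epsilon$

provided

$m + n \ge \left(\frac{1}{\epsilon}\right)^{\gamma(\theta)}$.

Besides,

$\mathbf{E}\, U^2 \le \gamma(\theta)$

for some constant $\gamma(\theta) > 0$. Here $i = \sqrt{-1}$.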



We will apply Theorem 4.1 in the following situation. Let $q : \mathbb{R}^{m+n} \to \mathbb{R}$ be the quadratic form defined by (1.2.1). Let $L \subset \mathbb{R}^{m+n}$ be a hyperplane not containing the null-space of $q$. Let us fix the Gaussian probability measure on $L$ with the density proportional to $e^{-q}$. We define random variables

$u_{jk} = \sqrt[3]{\frac{\zeta_{jk}(\zeta_{jk}+1)(2\zeta_{jk}+1)}{6}}\ (s_j + t_k)$,

where $s_1, \ldots, s_m; t_1, \ldots, t_n$ are the coordinates of a point in $L$. Then we have

$U = \sum_{1 \le j \le m,\ 1 \le k \le n} u_{jk}^3 = f(s,t)$

for $f$ defined by (1.2.2).

(4.2) The expectation of a product of Gaussian random variables. We 
will use the famous Wick's formula, see, for example, [Zv97]. Let $w_1, \ldots, w_l$ be Gaussian random variables such that

$\mathbf{E}\, w_1 = \ldots = \mathbf{E}\, w_l = 0$.

Then

$\mathbf{E}\, w_1 \cdots w_l = 0$ if $l = 2r + 1$ is odd and

$\mathbf{E}\, w_1 \cdots w_l = \sum \left(\mathbf{E}\, w_{i_1} w_{i_2}\right) \cdots \left(\mathbf{E}\, w_{i_{2r-1}} w_{i_{2r}}\right)$ if $l = 2r$ is even

and the sum is taken over all $(2r)!/r!2^r$ unordered partitions of the set of indices $\{1, \ldots, 2r\}$ into $r$ pairwise disjoint unordered pairs $\{i_1, i_2\}, \ldots, \{i_{2r-1}, i_{2r}\}$. Such a partition is called a matching of the random variables $w_1, \ldots, w_l$. We say that $w_i$ and $w_j$ are matched if they form a pair in the matching.
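In particular,

(4.2.1) $\mathbf{E}\, w^{2r} = \frac{(2r)!}{r!\,2^r} \left(\mathbf{E}\, w^2\right)^r$.

We will also use that

(4.2.2) $\mathbf{E}\left(w_1^3 w_2^3\right) = 9 \left(\mathbf{E}\, w_1^2\right)\left(\mathbf{E}\, w_2^2\right)\left(\mathbf{E}\, w_1 w_2\right) + 6 \left(\mathbf{E}\, w_1 w_2\right)^3$

and later in Section 5 that

(4.2.3)
$\operatorname{cov}\left(w_1^4, w_2^4\right) = \mathbf{E}\left(w_1^4 w_2^4\right) - \left(\mathbf{E}\, w_1^4\right)\left(\mathbf{E}\, w_2^4\right)$
$= 9\left(\mathbf{E}\, w_1^2\right)^2\left(\mathbf{E}\, w_2^2\right)^2 + 72\left(\mathbf{E}\, w_1 w_2\right)^2\left(\mathbf{E}\, w_1^2\right)\left(\mathbf{E}\, w_2^2\right) + 24\left(\mathbf{E}\, w_1 w_2\right)^4 - 9\left(\mathbf{E}\, w_1^2\right)^2\left(\mathbf{E}\, w_2^2\right)^2$
$= 72\left(\mathbf{E}\, w_1 w_2\right)^2\left(\mathbf{E}\, w_1^2\right)\left(\mathbf{E}\, w_2^2\right) + 24\left(\mathbf{E}\, w_1 w_2\right)^4$.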

All implied constants in the "O" notation in this section are absolute. 
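
The coefficients 9, 6, 72 and 24 in the identities of Section 4.2 are counts of matchings, so they can be checked by brute-force enumeration. The sketch below is only an illustration: the helper names `matchings` and `wick_moment` and the sample covariance values are assumptions of this example, not notation from the paper.

```python
def matchings(items):
    """Yield every partition of the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in matchings(remaining):
            yield [(first, partner)] + tail


def wick_moment(labels, cov):
    """E of the product of centered jointly Gaussian variables w_label,
    one factor per entry of `labels`, computed by Wick's formula."""
    if len(labels) % 2 == 1:
        return 0
    total = 0
    for matching in matchings(list(labels)):
        term = 1
        for a, b in matching:
            term *= cov[a][b]
        total += term
    return total


# Sample covariance data: E w1^2 = a, E w2^2 = b, E w1 w2 = c.
a, b, c = 2, 3, 1
cov = [[a, c], [c, b]]

moment_33 = wick_moment([0, 0, 0, 1, 1, 1], cov)     # E(w1^3 w2^3)
moment_44 = wick_moment([0, 0, 0, 0, 1, 1, 1, 1], cov)  # E(w1^4 w2^4)
moment_4 = wick_moment([0, 0, 0, 0], cov)            # E(w1^4)
count_8 = sum(1 for _ in matchings(list(range(8))))  # (2r)!/(r! 2^r), r = 4
```

For these sample values the enumeration agrees with the closed forms: $9ab\,c + 6c^3$ for the third powers and $9a^2b^2 + 72c^2ab + 24c^4$ for the fourth powers, and the number of matchings of 8 indices is $8!/(4!\,2^4) = 105$.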

(4.3) Auxiliary random variables $v_{jk}$. For the Gaussian random variables $\{u_{jk}\}$ of Theorem 4.1, let us define Gaussian random variables $\{v_{jk}\}$, where $j = 1, \ldots, m$ and $k = 1, \ldots, n$ such that

$\mathbf{E}\, v_{jk} = 0$ for all $j, k$ and

$\mathbf{E}\left(v_{j_1k_1} v_{j_2k_2}\right) = \mathbf{E}\left(u_{j_1k_1}^3 u_{j_2k_2}^3\right)$ for all $j_1, j_2$ and $k_1, k_2$.

We say that the random variables $u_{j_1k_1}$ and $u_{j_2k_2}$ in Theorem 4.1 are weakly correlated if $j_1 \ne j_2$ and $k_1 \ne k_2$ and that they are strongly correlated otherwise. Similarly, we say that $v_{j_1k_1}$ and $v_{j_2k_2}$ are weakly correlated if $j_1 \ne j_2$ and $k_1 \ne k_2$ and are strongly correlated otherwise.
By (4.2.2),

(4.3.1) $\mathbf{E}\, u_{j_1k_1}^3 u_{j_2k_2}^3 = \mathbf{E}\, v_{j_1k_1} v_{j_2k_2} = O\left(\frac{\theta^3}{(m+n)^4}\right)$ if $u_{j_1k_1}, u_{j_2k_2}$ are weakly correlated, and $= O\left(\frac{\theta^3}{(m+n)^3}\right)$ if $u_{j_1k_1}, u_{j_2k_2}$ are strongly correlated.

Since the number of weakly correlated pairs is $O(m^2n^2)$ while the number of strongly correlated pairs is $O(m^2n + n^2m)$, we obtain that

(4.3.2) $\mathbf{E}\, V^2 = \mathbf{E}\, U^2 = O\left(\theta^3\right)$.
(4.4) Representation of monomials by graphs. Let $x_{jk}$, $j = 1, \ldots, m$, $k = 1, \ldots, n$, be formal commuting variables. We interpret a monomial in $x_{jk}$ combinatorially, as a weighted graph. Let $K_{m,n}$ be the complete bipartite graph with $m + n$ vertices and $mn$ edges $(j, k)$ for $j = 1, \ldots, m$ and $k = 1, \ldots, n$. A weighted graph $G$ is a set of edges $(j, k)$ of $K_{m,n}$ with positive integer weights $\alpha_{jk}$ on it. With $G$, we associate a monomial

$t_G(x) = \prod_{(j,k) \in G} x_{jk}^{\alpha_{jk}}$.

The weight $\sum_{e \in G} \alpha_e$ of $G$ is the degree of the monomial. We observe that for any $p$ there are not more than $r^{O(r)}(m+n)^p$ distinct weighted graphs of weight $2r$ and $p$ vertices. We note that pairs of variables $u_{j_1k_1}$ and $u_{j_2k_2}$ and pairs of variables $v_{j_1k_1}$ and $v_{j_2k_2}$ corresponding to the edges $(j_1, k_1)$ and $(j_2, k_2)$ in different connected components are weakly correlated.

(4.5) Lemma. Given a graph $G$ of weight $2r$, $r \ge 1$, let us represent it as a vertex-disjoint union

$G = G_0 \cup G_1$,

where $G_0$ consists of $s$ isolated edges of weight 1 each and $G_1$ is a graph with no isolated edges of weight 1 (we may have $s = 0$ and no $G_0$).
Then

(1) We have

$\left|\mathbf{E}\, t_G\left(u_{jk}^3\right)\right| \le \frac{r^{O(r)}\theta^{3r}}{(m+n)^{3r+s/2}}$ and $\left|\mathbf{E}\, t_G\left(v_{jk}\right)\right| \le \frac{r^{O(r)}\theta^{3r}}{(m+n)^{3r+s/2}}$.

Additionally, if $s$ is odd, then

$\left|\mathbf{E}\, t_G\left(u_{jk}^3\right)\right|,\ \left|\mathbf{E}\, t_G\left(v_{jk}\right)\right| \le \frac{r^{O(r)}\theta^{3r}}{(m+n)^{3r+(s+1)/2}}$.

(2) If $s$ is even and $G_1$ has $r - s/2$ connected components, each consisting of a pair of edges of weight 1 each sharing one common vertex, then the number of vertices of $G$ is

$3r + \frac{s}{2}$.

In all other cases, $G$ has strictly fewer than $3r + s/2$ vertices.

(3) Suppose that $s$ is even and that $G_1$ has $r - s/2$ connected components, each consisting of a pair of edges of weight 1 each sharing one common vertex. Then

$\left|\mathbf{E}\, t_G\left(u_{jk}^3\right) - \mathbf{E}\, t_G\left(v_{jk}\right)\right| \le \frac{r^{O(r)}\theta^{3r}}{(m+n)^{3r+s/2+1}}$.
Proof. To prove Part (1), we use Wick's formula of Section 4.2. Since for each isolated edge $(j_1, k_1) \in G_0$, at least one of the three copies of the random variable $u_{j_1k_1}$ has to be matched with a copy of the variable $u_{j_2k_2}$ indexed by an edge $(j_2, k_2)$ in a different connected component, we conclude that each matching of the multiset

(4.5.1) $\left\{u_{jk}, u_{jk}, u_{jk} : (j,k) \in G\right\}$

contains at least $s/2$ weakly correlated pairs of variables and hence

$\left|\mathbf{E}\, t_G\left(u_{jk}^3\right)\right| \le r^{O(r)} \left(\frac{\theta}{(m+n)^2}\right)^{s/2} \left(\frac{\theta}{m+n}\right)^{3r-s/2}$.

Moreover, if $s$ is odd, the number of weakly correlated pairs in every matching is at least $(s+1)/2$ and hence

$\left|\mathbf{E}\, t_G\left(u_{jk}^3\right)\right| \le r^{O(r)} \left(\frac{\theta}{(m+n)^2}\right)^{(s+1)/2} \left(\frac{\theta}{m+n}\right)^{3r-(s+1)/2}$.

Similarly, since for each isolated edge $(j_1, k_1) \in G_0$, variable $v_{j_1k_1}$ has to be matched with a variable $v_{j_2k_2}$ indexed by an edge $(j_2, k_2)$ in a different connected component, we conclude that each matching of the set

(4.5.2) $\left\{v_{jk} : (j,k) \in G\right\}$

contains at least $s/2$ weakly correlated pairs of variables and hence by (4.3.1)

$\left|\mathbf{E}\, t_G\left(v_{jk}\right)\right| \le r^{O(r)} \left(\frac{\theta^3}{(m+n)^4}\right)^{s/2} \left(\frac{\theta^3}{(m+n)^3}\right)^{r-s/2}$.

Moreover, if $s$ is odd, the number of weakly correlated pairs in every matching is at least $(s+1)/2$ and hence

$\left|\mathbf{E}\, t_G\left(v_{jk}\right)\right| \le r^{O(r)} \left(\frac{\theta^3}{(m+n)^4}\right)^{(s+1)/2} \left(\frac{\theta^3}{(m+n)^3}\right)^{r-(s+1)/2}$,

which concludes the proof of Part (1).

To prove Part (2), we note that a connected weighted graph $G$ of weight $e$ contains a spanning tree with at most $e$ edges and hence has at most $e + 1$ vertices. In particular, a connected graph $G$ of weight $e$ contains fewer than $3e/2$ vertices unless $G$ is an isolated edge of weight 1 or a pair of edges of weight 1 each, sharing one common vertex. Therefore, $G$ has at most

$2s + \frac{3}{2}(2r - s) = 3r + \frac{s}{2}$

vertices and strictly fewer vertices, unless $s$ is even and the connected components of $G_1$ are pairs of edges of weight 1 each sharing one common vertex.

To prove Part (3), let us define $\Sigma_u(G)$ as the sum in the Wick's formula over all matchings of the multiset (4.5.1) of the following structure: we split the edges of $G$ into $r$ pairs, pairing each isolated edge with another isolated edge and pairing each edge in a 2-edge connected component of $G$ with the remaining edge in the same connected component. We then match every variable of the multiset (4.5.1) with a variable indexed by an edge in the same pair. Reasoning as in Part (1), we conclude that

$\left|\mathbf{E}\, t_G\left(u_{jk}^3\right) - \Sigma_u(G)\right| \le \frac{r^{O(r)}\theta^{3r}}{(m+n)^{3r+s/2+1}}$.

Similarly, let us define $\Sigma_v(G)$ as the sum in the Wick's formula over all matchings of the set (4.5.2) of the following structure: we split the edges of $G$ into $r$ pairs as above and match every variable in the set (4.5.2) with the variable indexed by the remaining edge of the pair. Then

$\left|\mathbf{E}\, t_G\left(v_{jk}\right) - \Sigma_v(G)\right| \le \frac{r^{O(r)}\theta^{3r}}{(m+n)^{3r+s/2+1}}$.

Since

$\Sigma_u(G) = \Sigma_v(G)$,

the proof of Part (3) follows. □

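(4.6) Lemma. Let $u_{jk}$ be random variables as in Theorem 4.1 and let $v_{jk}$ be the auxiliary Gaussian random variables as in Section 4.3. Let

$U = \sum_{1 \le j \le m,\ 1 \le k \le n} u_{jk}^3$ and $V = \sum_{1 \le j \le m,\ 1 \le k \le n} v_{jk}$.

Then for every integer $r \ge 1$ we have

$\left|\mathbf{E}\, U^{2r} - \mathbf{E}\, V^{2r}\right| \le \frac{r^{O(r)}\theta^{3r}}{m+n}$.

Proof. We can write

$\mathbf{E}\, U^{2r} = \sum_G a_G\, \mathbf{E}\, t_G\left(u_{jk}^3\right)$ and

$\mathbf{E}\, V^{2r} = \sum_G a_G\, \mathbf{E}\, t_G\left(v_{jk}\right)$,

where the sum is taken over all weighted graphs $G$ of the total weight $2r$ and

$1 \le a_G \le (2r)!$.

Let $\mathcal{G}_{2r}$ be the set of weighted graphs $G$ whose connected components consist of an even number $s$ of isolated edges and $r - s/2$ pairs of edges of weight 1 sharing one common vertex. Since there are not more than $r^{O(r)}(m+n)^p$ distinct weighted graphs with $p$ vertices, by Parts (1) and (2) of Lemma 4.5, we conclude that

$\left|\mathbf{E}\, U^{2r} - \sum_{G \in \mathcal{G}_{2r}} a_G\, \mathbf{E}\, t_G\left(u_{jk}^3\right)\right| \le \frac{r^{O(r)}\theta^{3r}}{m+n}$ and

$\left|\mathbf{E}\, V^{2r} - \sum_{G \in \mathcal{G}_{2r}} a_G\, \mathbf{E}\, t_G\left(v_{jk}\right)\right| \le \frac{r^{O(r)}\theta^{3r}}{m+n}$.

The proof now follows by Parts (1) and (3) of Lemma 4.5. □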



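(4.7) Proof of Theorem 4.1. Let $U$ and $V$ be the random variables as in Lemma 4.6. We use the standard estimate

$\left| e^{ix} - \sum_{s=0}^{2r-1} \frac{(ix)^s}{s!} \right| \le \frac{x^{2r}}{(2r)!}$ for $x \in \mathbb{R}$,

from which it follows that

(4.7.1) $\left|\mathbf{E}\, e^{iU} - \mathbf{E}\, e^{iV}\right| \le \frac{\mathbf{E}\, U^{2r}}{(2r)!} + \frac{\mathbf{E}\, V^{2r}}{(2r)!} + \sum_{s=0}^{2r-1} \frac{\left|\mathbf{E}\, U^s - \mathbf{E}\, V^s\right|}{s!}$.

By (4.3.2), we have

$\mathbf{E}\, U^2 = \mathbf{E}\, V^2 = O\left(\theta^3\right)$

and hence

$\mathbf{E}\, U^{2r},\ \mathbf{E}\, V^{2r} \le \frac{(2r)!}{2^r r!}\, 2^{O(r)} \theta^{3r}$.

Therefore, one can choose an integer $r$ such that

$r \ln r \le \gamma_1(\theta) \ln\frac{1}{\epsilon}$ for some constant $\gamma_1(\theta) > 0$

so that

$\frac{\mathbf{E}\, U^{2r} + \mathbf{E}\, V^{2r}}{(2r)!} \le \frac{\epsilon}{2}$.

By Lemma 4.6, as long as

$m + n \ge \left(\frac{1}{\epsilon}\right)^{\gamma(\theta)}$

for some constant $\gamma(\theta) > 0$, we have

$\left|\mathbf{E}\, U^{2k} - \mathbf{E}\, V^{2k}\right| \le \frac{\epsilon}{6}$ for $k = 0, 1, \ldots, r$.

We note that by symmetry

$\mathbf{E}\, V^s = \mathbf{E}\, U^s = 0$ if $s$ is odd.

Since $V$ is Gaussian, we have

$\mathbf{E}\, e^{iV} = \exp\left\{-\frac{1}{2} \mathbf{E}\, V^2\right\}$

and the proof follows by (4.7.1). □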

5. The fourth degree term 

In this section we prove the following main result. 

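(5.1) Theorem. Let $w_{jk}$, $j = 1, \ldots, m$ and $k = 1, \ldots, n$ be Gaussian random variables such that

$\mathbf{E}\, w_{jk} = 0$ for all $j, k$.

Suppose further that for some $\theta > 0$ we have

$\mathbf{E}\, w_{jk}^2 \le \frac{\theta}{m+n}$ for all $j, k$ and

$\left|\mathbf{E}\, w_{j_1k_1} w_{j_2k_2}\right| \le \frac{\theta}{(m+n)^2}$ provided $j_1 \ne j_2$ and $k_1 \ne k_2$.

Let

$W = \sum_{1 \le j \le m,\ 1 \le k \le n} w_{jk}^4$.

Then for some absolute constant $\gamma > 0$ we have

(1) $\mathbf{E}\, W \le \gamma\theta^2$;

(2) $\operatorname{var} W \le \frac{\gamma\theta^4}{m+n}$;

(3) $\mathbf{P}\left\{W > \gamma\theta^2 + 1\right\} \le \exp\left\{-(m+n)^{1/5}\right\}$

provided $m + n \ge \gamma_1(\theta)$ for some $\gamma_1(\theta) > 0$.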

We will apply Theorem 5.1 in the following situation. Let $q : \mathbb{R}^{m+n} \to \mathbb{R}$ be the quadratic form defined by (1.2.1) and let $L \subset \mathbb{R}^{m+n}$ be a hyperplane not containing the null-space of $q$. Let us fix the Gaussian probability measure in $L$ with the density proportional to $e^{-q}$. We define random variables

$w_{jk} = \sqrt[4]{\frac{\zeta_{jk}(\zeta_{jk}+1)\left(6\zeta_{jk}^2 + 6\zeta_{jk} + 1\right)}{24}}\ (s_j + t_k)$,

where $s_1, \ldots, s_m; t_1, \ldots, t_n$ are the coordinates of a point in $L$. Then we have

$W = \sum_{1 \le j \le m,\ 1 \le k \le n} w_{jk}^4 = h(s,t)$

for $h$ defined by (1.2.2).

While the proof of Parts (1) and (2) is a straightforward computation, to prove 
Part (3) we need reverse Holder inequalities for polynomials with respect to the 
Gaussian measure. 

(5.2) Lemma. Let $p$ be a polynomial of degree $d$ in random Gaussian variables $w_1, \ldots, w_l$.

Then for $r > 2$ we have

$\left(\mathbf{E}\, |p|^r\right)^{1/r} \le r^{d/2} \left(\mathbf{E}\, p^2\right)^{1/2}$.

Proof. This is Corollary 5 of [Du87]. □



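(5.3) Proof of Theorem 5.1. Using formula (4.2.1), we get

$\mathbf{E}\, w_{jk}^4 = 3\left(\mathbf{E}\, w_{jk}^2\right)^2 \le \frac{3\theta^2}{(m+n)^2}$

and hence

$\mathbf{E}\, W = \sum_{1 \le j \le m,\ 1 \le k \le n} \mathbf{E}\, w_{jk}^4 \le (mn)\frac{3\theta^2}{(m+n)^2} \le 3\theta^2$,

which proves Part (1).

To prove Part (2), we note that

$\operatorname{var} W = \sum_{1 \le j_1, j_2 \le m,\ 1 \le k_1, k_2 \le n} \operatorname{cov}\left(w_{j_1k_1}^4, w_{j_2k_2}^4\right)$.

Using (4.2.3), we get

$\operatorname{cov}\left(w_{j_1k_1}^4, w_{j_2k_2}^4\right) = 72\left(\mathbf{E}\, w_{j_1k_1} w_{j_2k_2}\right)^2\left(\mathbf{E}\, w_{j_1k_1}^2\right)\left(\mathbf{E}\, w_{j_2k_2}^2\right) + 24\left(\mathbf{E}\, w_{j_1k_1} w_{j_2k_2}\right)^4$.

Hence

$\operatorname{cov}\left(w_{j_1k_1}^4, w_{j_2k_2}^4\right) \le \frac{96\theta^4}{(m+n)^4}$ for all $j_1, j_2$ and $k_1, k_2$.

Additionally,

$\operatorname{cov}\left(w_{j_1k_1}^4, w_{j_2k_2}^4\right) \le \frac{72\theta^4}{(m+n)^6} + \frac{24\theta^4}{(m+n)^8} \le \frac{96\theta^4}{(m+n)^6}$ provided $j_1 \ne j_2$ and $k_1 \ne k_2$.

Summarizing,

$\operatorname{var} W \le m^2n^2 \frac{96\theta^4}{(m+n)^6} + \left(m^2n + n^2m\right)\frac{96\theta^4}{(m+n)^4} \le \frac{\gamma\theta^4}{m+n}$,

which proves Part (2).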

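To prove Part (3), we apply Lemma 5.2 with $d = 4$ to $p(W) = W - \mathbf{E}\, W$. From Part (2), we get

$\mathbf{E}\, |W - \mathbf{E}\, W|^r \le r^{2r} (\operatorname{var} W)^{r/2} \le r^{2r} \left(\frac{\gamma\theta^4}{m+n}\right)^{r/2}$.

Let us choose

$r = (m+n)^{1/5}$

for sufficiently large $m + n$. Then

$r^{2r} \left(\frac{\gamma\theta^4}{m+n}\right)^{r/2} = \exp\left\{2r \ln r + \frac{r}{2} \ln\left(\gamma\theta^4\right) - \frac{r}{2} \ln(m+n)\right\}$
$= \exp\left\{-\frac{1}{10}(m+n)^{1/5} \ln(m+n) + \frac{\ln\left(\gamma\theta^4\right)}{2}(m+n)^{1/5}\right\}$
$\le \exp\left\{-(m+n)^{1/5}\right\}$

provided $\ln(m+n) \ge 5 \ln\left(\gamma\theta^4\right) + 10$.

Hence if $m + n$ is sufficiently large, we have

$\mathbf{E}\, |W - \mathbf{E}\, W|^r \le \exp\left\{-(m+n)^{1/5}\right\}$.

The proof follows by Part (1) and Markov's inequality. □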
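
The exponent bookkeeping in the proof of Part (3) is easy to check numerically. The following sketch is a hedged illustration only: `total` stands for $m+n$ and `c` for the product $\gamma\theta^4$, and both the names and the sample values are assumptions of this example. With $r = (m+n)^{1/5}$, the logarithm of $r^{2r}(c/(m+n))^{r/2}$ should not exceed $-(m+n)^{1/5}$ once $\ln(m+n) \ge 5\ln c + 10$.

```python
import math


def log_moment_bound(total, c):
    """Logarithm of r^(2r) * (c / total)^(r/2) with r = total^(1/5)."""
    r = total ** 0.2
    return (2 * r * math.log(r)
            + (r / 2) * math.log(c)
            - (r / 2) * math.log(total))


def threshold_holds(total, c):
    """The sufficient condition ln(total) >= 5 ln(c) + 10 from the proof."""
    return math.log(total) >= 5 * math.log(c) + 10


total, c = 10 ** 6, 2.0           # sample values, chosen for illustration
bound = log_moment_bound(total, c)
target = -(total ** 0.2)          # logarithm of exp{-(m+n)^(1/5)}
```

Since $2r\ln r = \frac{2}{5} r \ln(m+n)$, the logarithm equals $r\left(-\frac{1}{10}\ln(m+n) + \frac{1}{2}\ln c\right)$, which is at most $-r$ exactly when the stated condition holds.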
6. Computing the integral over a neighborhood of the origin

We consider the integral

$\int_{\Pi_0} F(s,t)\ ds\,dt$

of Corollary 2.2. Recall that $\Pi_0$ is the facet of the parallelepiped

$\Pi = \left\{(s_1, \ldots, s_m; t_1, \ldots, t_n) : -\pi \le s_j, t_k \le \pi \text{ for all } j, k\right\}$

defined by the equation $t_n = 0$ and that

$F(s,t) = \exp\left\{-i\sum_{j=1}^{m} r_j s_j - i\sum_{k=1}^{n} c_k t_k\right\} \prod_{1 \le j \le m,\ 1 \le k \le n} \frac{1}{1 + \zeta_{jk} - \zeta_{jk} e^{i(s_j + t_k)}}$.

In this section, we prove the following main result. 

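(6.1) Theorem. Let us fix a number $0 < \delta < 1$. Suppose that $m \ge \delta n$, $n \ge \delta m$ and that

$\delta\tau \le \zeta_{jk} \le \tau$ for all $j, k$

and some $\tau \ge \delta$.

Let $q : \mathbb{R}^{m+n} \to \mathbb{R}$ be the quadratic form defined by (1.2.1) and let $f, h : \mathbb{R}^{m+n} \to \mathbb{R}$ be the polynomials defined by (1.2.2). Let us define a neighborhood $U \subset \Pi_0$ by

$U = \left\{(s,t) \in \Pi_0 : |s_j|, |t_k| \le \frac{\ln(m+n)}{\tau\sqrt{m+n}} \text{ for all } j, k\right\}$.

Let us identify the hyperplane $t_n = 0$ containing $\Pi_0$ with $\mathbb{R}^{m+n-1}$, let

$\Xi = \int_{\mathbb{R}^{m+n-1}} e^{-q}\ ds\,dt$

and let us consider the Gaussian probability measure in $\mathbb{R}^{m+n-1}$ with the density $\Xi^{-1} e^{-q}$. Let

$\mu = \mathbf{E}\, f^2$ and $\nu = \mathbf{E}\, h$.

Then

(1) $\Xi \ge \frac{(2\pi)^{(m+n-1)/2}}{m^{(n-1)/2}\, n^{(m-1)/2}\, (\tau + \tau^2)^{(m+n-1)/2}}$.

(2) $0 \le \mu, \nu \le \gamma(\delta)$ for some constant $\gamma(\delta) > 0$.

(3) For any $0 < \epsilon \le 1/2$

$\left|\int_U F(s,t)\ ds\,dt - \exp\left\{-\frac{\mu}{2} + \nu\right\}\Xi\right| \le \epsilon\,\Xi$

provided $m + n \ge \left(\frac{1}{\epsilon}\right)^{\gamma(\delta)}$ for some constant $\gamma(\delta) > 0$.

(4) For any $0 < \epsilon \le 1/2$

$\left|\int_U |F(s,t)|\ ds\,dt - \exp\{\nu\}\,\Xi\right| \le \epsilon\,\Xi$

provided $m + n \ge \left(\frac{1}{\epsilon}\right)^{\gamma(\delta)}$ for some constant $\gamma(\delta) > 0$.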

Proof. All implied constants in the "O" and "Ω" notations below may depend only on the parameter $\delta$.
Let

$q_0(s,t) = \frac{1}{2} \sum_{1 \le j \le m,\ 1 \le k \le n} (s_j + t_k)^2$

as in Lemma 3.6. Then

$q(s,t) \le (\tau + \tau^2)\, q_0(s,t)$

and, therefore,

$\int_{\mathbb{R}^{m+n-1}} \exp\{-q(s,t)\}\ ds\,dt \ge \int_{\mathbb{R}^{m+n-1}} \exp\left\{-(\tau + \tau^2)\, q_0(s,t)\right\}\ ds\,dt = \frac{\pi^{(m+n-1)/2}}{(\tau^2 + \tau)^{(m+n-1)/2} \sqrt{\det q_0|\mathbb{R}^{m+n-1}}}$,

where $q_0|\mathbb{R}^{m+n-1}$ is the restriction of the form $q_0$ onto the coordinate hyperplane $t_n = 0$ in $\mathbb{R}^{m+n}$.

Let $H \subset \mathbb{R}^{m+n}$ be the orthogonal complement in $\mathbb{R}^{m+n}$ to the kernel of $q_0$, that is, the hyperplane defined by the equation:

$s_1 + \ldots + s_m = t_1 + \ldots + t_n$ for $(s,t) = (s_1, \ldots, s_m; t_1, \ldots, t_n)$.

Then, from the eigenspace decomposition of Lemma 3.6 it follows that

$\det q_0|H = \left(\frac{m}{2}\right)^{n-1} \left(\frac{n}{2}\right)^{m-1} \left(\frac{m+n}{2}\right)$.

On the other hand, by Lemma 3.5,

$\det q_0|\mathbb{R}^{m+n-1} = \frac{1}{m+n} \det q_0|H$

and the proof of Part (1) follows.
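
The determinant identity used for Part (1) can be verified exactly on a small instance. The sketch below is an illustration under stated assumptions: it takes $q_0(x) = \langle x, Mx\rangle$ with $M = \frac{1}{2}\begin{pmatrix} nI & J \\ J^{T} & mI \end{pmatrix}$ (here $J$ is the all-ones $m \times n$ block), computes the determinant of the restriction to each coordinate hyperplane as a principal minor, and compares it with $\frac{1}{m+n}$ times the product of the nonzero eigenvalues from Lemma 3.6. The helper names and the sizes $m = 3$, $n = 4$ are hypothetical choices for this example.

```python
from fractions import Fraction


def q0_matrix(m, n):
    """Symmetric matrix M with q0(x) = <x, M x> for x = (s, t)."""
    size = m + n
    mat = [[Fraction(0)] * size for _ in range(size)]
    for j in range(m):
        mat[j][j] = Fraction(n, 2)
        for k in range(n):
            mat[j][m + k] = Fraction(1, 2)
            mat[m + k][j] = Fraction(1, 2)
    for k in range(n):
        mat[m + k][m + k] = Fraction(m, 2)
    return mat


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result


def det_on_coordinate_hyperplane(m, n, index):
    """det of q0 restricted to the hyperplane where coordinate `index` is 0."""
    mat = q0_matrix(m, n)
    keep = [i for i in range(m + n) if i != index]
    return det([[mat[r][c] for c in keep] for r in keep])


m, n = 3, 4
det_H = (Fraction(m, 2) ** (n - 1) * Fraction(n, 2) ** (m - 1)
         * Fraction(m + n, 2))
minors = [det_on_coordinate_hyperplane(m, n, i) for i in range(m + n)]
```

Every entry of `minors` should coincide with `det_H / (m + n)`, in line with the remark after Lemma 3.5 that the value does not depend on the choice of the coordinate hyperplane.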

Let us consider the coordinate functions $s_j, t_k$ as random variables on the space $\mathbb{R}^{m+n-1}$ with the Gaussian probability measure with the density $\Xi^{-1} e^{-q}$. From Theorem 3.2,

(6.1.1)
$\mathbf{E}\,(s_j + t_k)^2 = O\left(\frac{1}{\tau^2(m+n)}\right)$ for all $j, k$ and
$\left|\mathbf{E}\,(s_{j_1} + t_{k_1})(s_{j_2} + t_{k_2})\right| = O\left(\frac{1}{\tau^2 mn}\right)$ if $j_1 \ne j_2$ and $k_1 \ne k_2$.

Let

$u_{jk} = \sqrt[3]{\frac{\zeta_{jk}(\zeta_{jk}+1)(2\zeta_{jk}+1)}{6}}\ (s_j + t_k)$ and

$w_{jk} = \sqrt[4]{\frac{\zeta_{jk}(\zeta_{jk}+1)\left(6\zeta_{jk}^2 + 6\zeta_{jk} + 1\right)}{24}}\ (s_j + t_k)$ for all $j, k$.

Then $u_{jk}$ and $w_{jk}$ are Gaussian random variables such that

(6.1.2)
$\mathbf{E}\, u_{jk}^2,\ \mathbf{E}\, w_{jk}^2 = O\left(\frac{1}{m+n}\right)$ for all $j, k$ and
$\left|\mathbf{E}\, u_{j_1k_1} u_{j_2k_2}\right|,\ \left|\mathbf{E}\, w_{j_1k_1} w_{j_2k_2}\right| = O\left(\frac{1}{mn}\right)$ if $j_1 \ne j_2$ and $k_1 \ne k_2$.

We observe that

$f = \sum_{1 \le j \le m,\ 1 \le k \le n} u_{jk}^3$ and $h = \sum_{1 \le j \le m,\ 1 \le k \le n} w_{jk}^4$.

Therefore, the bound

$\mu = \mathbf{E}\, f^2 = O(1)$

follows by Theorem 4.1 while the bound

$\nu = \mathbf{E}\, h = O(1)$

follows by Part (1) of Theorem 5.1. This concludes the proof of Part (2).

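Since $s_j + t_k$ is a Gaussian random variable, by the first inequality of (6.1.1), we conclude that

$\mathbf{P}\left\{|s_j + t_k| \ge \frac{\ln(m+n)}{2\tau\sqrt{m+n}}\right\} \le \exp\left\{-\Omega\left(\ln^2(m+n)\right)\right\}$.

Note that the inequalities hold for $k = n$ with $t_n = 0$ as well, and hence

(6.1.3) $\int_{\mathbb{R}^{m+n-1} \setminus U} e^{-q}\ ds\,dt \le \exp\left\{-\Omega\left(\ln^2(m+n)\right)\right\}\Xi$,

where $U$ is the neighborhood defined in the theorem. Therefore,

$\left|\int_{\mathbb{R}^{m+n-1} \setminus U} \exp\{-q - if\}\ ds\,dt\right| \le \exp\left\{-\Omega\left(\ln^2(m+n)\right)\right\}\Xi$.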



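Since $f = \sum_{jk} u_{jk}^3$ and Gaussian random variables $u_{jk}$ satisfy (6.1.2), from Theorem 4.1 we conclude that for any $0 < \epsilon \le 1/2$ we have

$\left|\int_{\mathbb{R}^{m+n-1}} \exp\{-q - if\}\ ds\,dt - \exp\left\{-\frac{\mu}{2}\right\}\Xi\right| \le \epsilon\,\Xi$ provided $m + n \ge \left(\frac{1}{\epsilon}\right)^{O(1)}$.

Therefore, for any $0 < \epsilon \le 1/2$ we have

(6.1.4) $\left|\int_U \exp\{-q - if\}\ ds\,dt - \exp\left\{-\frac{\mu}{2}\right\}\Xi\right| \le \epsilon\,\Xi$ provided $m + n \ge \left(\frac{1}{\epsilon}\right)^{O(1)}$.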

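Since $h = \sum_{jk} w_{jk}^4$ and Gaussian random variables $w_{jk}$ satisfy (6.1.2), by Part (2) of Theorem 5.1, we have

$\operatorname{var} h = \mathbf{E}\,(h - \nu)^2 = O\left(\frac{1}{m+n}\right)$.

Applying Chebyshev's inequality, we conclude that for any $\epsilon > 0$

(6.1.5) $\int_{(s,t) \in U:\ |h(s,t) - \nu| > \epsilon} e^{-q}\ ds\,dt = O\left(\frac{1}{\epsilon^2(m+n)}\right)\Xi$.

By Part (3) of Theorem 5.1, for some constant $\gamma(\delta) > 0$ we have

(6.1.6) $\int_{(s,t) \in U:\ h(s,t) > \gamma(\delta)} e^{-q}\ ds\,dt = O\left(\exp\left\{-(m+n)^{1/5}\right\}\right)\Xi$.

In addition,

(6.1.7) $h = O\left(\ln^4(m+n)\right)$ in $U$.

In view of (6.1.3) and Part (2) of the theorem, (6.1.5)-(6.1.7) imply that for any $0 < \epsilon \le 1/2$ we have

(6.1.8) $\left|\int_U \exp\{-q + h\}\ ds\,dt - \exp\{\nu\}\,\Xi\right| \le \epsilon\,\Xi$ provided $m + n \ge \left(\frac{1}{\epsilon}\right)^{O(1)}$.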


Similarly, from (6.1.4)-(6.1.7) we deduce that

(6.1.9) $\left|\int_U \exp\{-q - if + h\}\ ds\,dt - \exp\left\{-\frac{\mu}{2} + \nu\right\}\Xi\right| \le \epsilon\,\Xi$ provided $m + n \ge \left(\frac{1}{\epsilon}\right)^{O(1)}$.

From the Taylor series expansion, cf. (2.2.1), we write

$F(s,t) = \exp\{-q(s,t) - if(s,t) + h(s,t) + \rho(s,t)\}$ where

$|\rho(s,t)| = O\left(\frac{\ln^5(m+n)}{\sqrt{m+n}}\right)$ for $(s,t) \in U$.

Therefore, using (6.1.8) and Part (2) of the theorem we conclude that

$\int_U F\ ds\,dt - \int_U \exp\{-q - if + h\}\ ds\,dt = O\left(\frac{\ln^5(m+n)}{\sqrt{m+n}}\right)\Xi$

and

$\int_U |F|\ ds\,dt - \int_U \exp\{-q + h\}\ ds\,dt = O\left(\frac{\ln^5(m+n)}{\sqrt{m+n}}\right)\Xi$.

We complete the proof of Parts (3) and (4) from (6.1.8) and (6.1.9). □



7. Bounding the integral outside of a neighborhood of the origin 

We consider the integral representation of Corollary 2.2. In this section we 
prove that the integral of F(s,t) outside of the neighborhood U of the origin is 
asymptotically negligible (note that by Theorem 6.1 the integral of $F$ and the
integral of $|F|$ over $U$ have the same order). We prove the following main result.

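(7.1) Theorem. Let us fix a number $0 < \delta < 1$. Suppose that $m \ge \delta n$, $n \ge \delta m$ and that

$\delta\tau \le \zeta_{jk} \le \tau$ for all $j, k$

and some $\delta \le \tau \le (m+n)^{1/\delta}$. Let

$U = \left\{(s,t) \in \Pi_0 : |s_j|, |t_k| \le \frac{\ln(m+n)}{\tau\sqrt{m+n}} \text{ for all } j, k\right\}$.

Then for any $\kappa > 0$

$\int_{\Pi_0 \setminus U} |F(s,t)|\ ds\,dt \le (m+n)^{-\kappa} \int_{\Pi_0} |F(s,t)|\ ds\,dt$

provided $m + n \ge \gamma(\delta, \kappa)$ for some $\gamma(\delta, \kappa) > 0$.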

We prove Theorem 7.1 in two steps. First, by a string of combinatorial arguments we show that the integral

$\int_{\Pi_0 \setminus I} |F(s,t)|\ ds\,dt$

is negligible compared to

(7.1.1) $\int_I |F(s,t)|\ ds\,dt$,

where $I \subset \Pi_0$ is a larger neighborhood of the origin,

$I = \left\{(s,t) \in \Pi_0 : |s_j|, |t_k| \le \epsilon/\tau \text{ for all } j, k\right\}$

and $\epsilon > 0$ is any fixed number. This is the only place where we use that $\tau$ is bounded above by a polynomial in $m + n$. Then we notice that for a sufficiently small $\epsilon = \epsilon(\delta)$, the function $|F(s,t)|$ is strictly log-concave on $I$ and we use a concentration inequality for strictly log-concave measures to deduce that the integral

$\int_{I \setminus U} |F(s,t)|\ ds\,dt$

is negligible compared to (7.1.1).
(7.2) Metric $\rho$. Let us introduce the following function

$\rho : \mathbb{R} \to [0, \pi]$, $\rho(x) = \min_{k \in \mathbb{Z}} |x - 2\pi k|$.

In words: $\rho(x)$ is the distance from $x$ to the closest integer multiple of $2\pi$. Clearly,

$\rho(-x) = \rho(x)$ and $\rho(x + y) \le \rho(x) + \rho(y)$

for all $x, y \in \mathbb{R}$. We will use that

(7.2.1) $1 - \frac{1}{2}\rho^2(x) \le \cos x \le 1 - \frac{1}{5}\rho^2(x)$.
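
The function $\rho$ and the two-sided bound (7.2.1) are simple to test numerically. The following is a minimal sketch; the implementation of `rho` through `math.fmod` and the sample grid are choices of this example, and the small tolerance only absorbs floating-point rounding.

```python
import math

TWO_PI = 2 * math.pi


def rho(x):
    """Distance from x to the closest integer multiple of 2*pi."""
    r = abs(math.fmod(x, TWO_PI))  # remainder with |r| < 2*pi
    return min(r, TWO_PI - r)


def satisfies_cosine_bounds(x, tol=1e-12):
    """Check 1 - rho(x)^2/2 <= cos x <= 1 - rho(x)^2/5 up to rounding."""
    p = rho(x)
    return (1 - p * p / 2 <= math.cos(x) + tol
            and math.cos(x) <= 1 - p * p / 5 + tol)


grid = [-10 + 0.01 * k for k in range(2001)]  # sample points in [-10, 10]
all_ok = all(satisfies_cosine_bounds(x) for x in grid)
```

The upper bound is nearly attained at $\rho(x) = \pi$, where $\cos x = -1$ and $1 - \pi^2/5 \approx -0.974$, which is why the constant $1/5$ cannot be replaced by a much larger one.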

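(7.3) The absolute value of $F(s,t)$. Let

$\alpha_{jk} = 2\zeta_{jk}(1 + \zeta_{jk})$ for all $j, k$.

Then

(7.3.1) $2\delta^2\tau^2 \le \alpha_{jk} \le 4\tau^2$ for all $j, k$.

Let us define functions

$f_{jk}(x) = \frac{1}{\sqrt{1 + \alpha_{jk} - \alpha_{jk}\cos x}}$ for $x \in \mathbb{R}$.

Then we can write

(7.3.2) $|F(s,t)| = \prod_{1 \le j \le m,\ 1 \le k \le n} f_{jk}(s_j + t_k)$.

We observe that

$f_{jk}(x) = 1$ provided $\rho(x) = 0$ and that $f_{jk}(-x) = f_{jk}(x)$.

From (7.2.1) and (7.3.1) we conclude that for any $\epsilon > 0$ we have

(7.3.3) $f_{jk}(x) \le \exp\{-\gamma(\delta, \epsilon)\}\, f_{jk}(y)$ provided $\rho(x) \ge \frac{2\epsilon}{\tau}$ and $\rho(y) \le \frac{\epsilon}{\tau}$,

where $\gamma(\delta, \epsilon) > 0$ is a constant.
Finally, we observe that

$\frac{d^2}{dx^2} \ln f_{jk}(x) = -\frac{\alpha_{jk}(1 + \alpha_{jk})\cos x - \alpha_{jk}^2}{2\left(1 + \alpha_{jk} - \alpha_{jk}\cos x\right)^2}$.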
dx 2 J 2(1 + ajk — ctjk cos x) 2 
It follows from (7.2.1) and (7.3.1) that for all $j, k$

d 2 2 5 2 

(7.3.4) —^\iif jk (x) < ~-r 2 5 2 provided |x| < — . 

ClX O OT 

In particular, the function hafj k {x) is strictly log-concave on the interval \x\ < 
5 2 /5t. 

In what follows, we fix a particular parameter $\epsilon > 0$. All implied constants in the "O" and "Ω" notation may depend only on the parameters $\delta$ and $\epsilon$. We say that $m$ and $n$ are sufficiently large if $m + n \ge \gamma(\delta, \epsilon)$ for some constant $\gamma(\delta, \epsilon) > 0$. Recall that $m$ and $n$ are of the same order, $m \ge \delta n$ and $n \ge \delta m$.

Our first goal is to show that for any fixed $\epsilon > 0$ only the points $(s,t) \in \Pi_0$ for which the inequality $\rho(s_j + t_k) \le \epsilon/\tau$ holds for the overwhelming majority of pairs of indices $(j,k)$ contribute significantly to the integral of $|F(s,t)|$ on $\Pi_0$. Recall that $t_n = 0$ on $\Pi_0$.

(7.4) Lemma. For $\epsilon > 0$ and a point $(s,t) \in \Pi_0$ let us define the following two sets:

Let $J = J(s,t;\epsilon) \subset \{1, \ldots, m\}$ be the set of all indices $j$ such that

$\rho(s_j + t_k) \le \epsilon/\tau$

for more than $(n-1)/2$ distinct indices $k = 1, \ldots, n-1$
and

Let $K = K(s,t;\epsilon) \subset \{1, \ldots, n-1\}$ be the set of all indices $k$ such that

$\rho(s_j + t_k) \le \epsilon/\tau$

for more than $m/2$ distinct indices $j = 1, \ldots, m$.

Let $\bar{J} = \{1, \ldots, m\} \setminus J$ and let $\bar{K} = \{1, \ldots, n-1\} \setminus K$.
Then

(1)
$|F(s,t)| \le \exp\left\{-\gamma(\delta, \epsilon)\, n\, |\bar{J}|\right\}$ and
$|F(s,t)| \le \exp\left\{-\gamma(\delta, \epsilon)\, m\, |\bar{K}|\right\}$

for some constant $\gamma(\delta, \epsilon) > 0$.

(2)
$\rho(s_{j_1} - s_{j_2}) \le 2\epsilon/\tau$ for all $j_1, j_2 \in J$ and
$\rho(t_{k_1} - t_{k_2}) \le 2\epsilon/\tau$ for all $k_1, k_2 \in K$.

(3) If $|J| > m/2$ or $|K| > (n-1)/2$ then

$\rho(s_j + t_k) \le 3\epsilon/\tau$ for all $j \in J$ and all $k \in K$.
Proof. For every $j \in \bar{J}$ there are at least $(n-1)/2$ distinct $k$ and for every $k \in \bar{K}$ there are at least $m/2$ distinct $j$ such that

$\rho(s_j + t_k) > \epsilon/\tau$

and hence

$f_{jk}(s_j + t_k) \le \exp\{-\Omega(1)\}$

by (7.3.3). Part (1) follows from (7.3.2).

For every $j_1, j_2 \in J$ there is at least one common index $k$ such that

$\rho(s_{j_1} + t_k) \le \epsilon/\tau$ and $\rho(s_{j_2} + t_k) \le \epsilon/\tau$.

Then

$\rho(s_{j_1} - s_{j_2}) = \rho(s_{j_1} + t_k - s_{j_2} - t_k) \le \rho(s_{j_1} + t_k) + \rho(s_{j_2} + t_k) \le 2\epsilon/\tau$.

The second inequality of Part (2) follows similarly.

If $|K| > (n-1)/2$ then for every $j \in J$ there is a $k_j \in K$ such that

$\rho(s_j + t_{k_j}) \le \epsilon/\tau$.

Then, by Part (2), for every $k \in K$ we have

$\rho(s_j + t_k) = \rho(s_j + t_{k_j} - t_{k_j} + t_k) \le \rho(s_j + t_{k_j}) + \rho(t_k - t_{k_j}) \le 3\epsilon/\tau$.

The case of $|J| > m/2$ is handled similarly. □
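
The sets $J$ and $K$ of Lemma 7.4 are defined by a majority count, which the following sketch computes for a concrete point. It is an illustration under assumptions: indices are 0-based, the list `t_free` holds only $t_1, \ldots, t_{n-1}$ (the coordinate $t_n = 0$ is omitted), the inequalities in the definitions are read as non-strict, and the function names and the sample point are hypothetical.

```python
import math

TWO_PI = 2 * math.pi


def rho(x):
    """Distance from x to the closest integer multiple of 2*pi."""
    r = abs(math.fmod(x, TWO_PI))
    return min(r, TWO_PI - r)


def majority_sets(s, t_free, eps, tau):
    """Return (J, K): indices j (resp. k) for which rho(s_j + t_k) <= eps/tau
    holds for more than half of the k (resp. j)."""
    threshold = eps / tau
    m, n1 = len(s), len(t_free)
    J = {j for j in range(m)
         if sum(rho(s[j] + tk) <= threshold for tk in t_free) > n1 / 2}
    K = {k for k in range(n1)
         if sum(rho(sj + t_free[k]) <= threshold for sj in s) > m / 2}
    return J, K


# Sample point: two small s-coordinates and one outlier.
s = [0.0, 0.02, 3.0]
t_free = [0.0, -0.01]
eps, tau = 0.1, 1.0
J, K = majority_sets(s, t_free, eps, tau)

# Part (2): pairwise differences inside J are at most 2*eps/tau.
part2_ok = all(rho(s[a] - s[b]) <= 2 * eps / tau for a in J for b in J)
# Part (3): here |J| > m/2, so rho(s_j + t_k) <= 3*eps/tau on J x K.
part3_ok = all(rho(s[j] + t_free[k]) <= 3 * eps / tau for j in J for k in K)
```

In this example the outlier index is excluded from $J$, and the conclusions of Parts (2) and (3) can be read off directly from `part2_ok` and `part3_ok`.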

Using estimates of Theorem 6.1, it is not hard to deduce from Part (1) of Lemma 7.4 that for any fixed $\epsilon > 0$ only points $(s,t) \in \Pi_0$ with $|\bar{J}| = O(\ln m)$ and $|\bar{K}| = O(\ln n)$ may contribute essentially to the integral of $|F(s,t)|$. It follows then by Part (3) of Lemma 7.4 that for such points we have $\rho(s_j + t_k) \le 3\epsilon/\tau$ for all $j \in J$ and all $k \in K$. Our next goal is to show that only those points $(s,t) \in \Pi_0$ contribute substantially to the integral for which $\rho(s_j + t_k) \le \epsilon/\tau$ for all pairs $(j,k)$.

(7.5) Proposition. For $\epsilon > 0$ let us define a set $X(\epsilon) \subset \Pi_0$ by

$X(\epsilon) = \left\{(s,t) \in \Pi_0 : \rho(s_j + t_k) \le \epsilon/\tau \text{ for all } j = 1, \ldots, m \text{ and } k = 1, \ldots, n-1\right\}$.

Then

$\int_{\Pi_0 \setminus X(\epsilon)} |F(s,t)|\ ds\,dt \le \exp\{-\gamma(\delta, \epsilon)(m+n)\} \int_{\Pi_0} |F(s,t)|\ ds\,dt$

for some constant $\gamma(\delta, \epsilon) > 0$ and all sufficiently large $m + n$.

Proof. For subsets $A \subset \{1, \ldots, m\}$ and $B \subset \{1, \ldots, n-1\}$ let us define a set $P_{A,B} \subset \Pi_0$ (we call it a piece) by

$P_{A,B} = \left\{(s,t) \in \Pi_0 : \rho(s_j + t_k) \le \epsilon/40\tau \text{ for all } j \in A \text{ and all } k \in B\right\}$.

Let

$V = \bigcup_{A,B} P_{A,B}$,

where the union is taken over all subsets $A$ and $B$ such that

$|\bar{A}| \le \ln^2 m$ and $|\bar{B}| \le \ln^2 n$,

where

$\bar{A} = \{1, \ldots, m\} \setminus A$ and $\bar{B} = \{1, \ldots, n-1\} \setminus B$.

We claim that the integral over $\Pi_0 \setminus V$ is asymptotically negligible. Indeed, for sufficiently large $m + n$ and for all $(s,t) \in \Pi_0 \setminus V$, by Part (3) of Lemma 7.4, we must have

$|\bar{J}(s,t;\epsilon/120)| > \ln^2 m$ or $|\bar{K}(s,t;\epsilon/120)| > \ln^2 n$.

In either case, by Part (1) of Lemma 7.4, we must have

$|F(s,t)| \le \exp\left\{-\Omega\left(n \ln^2 n\right)\right\}$

for all sufficiently large $m + n$. On the other hand, by Parts (1), (2) and (4) of Theorem 6.1, we conclude that

$\int_{\Pi_0} |F(s,t)|\ ds\,dt \ge \exp\{-O(n \ln n)\}$

(we use that $\tau$ is bounded by a polynomial in $m + n$). This proves that

(7.5.1) $\int_{\Pi_0 \setminus V} |F(s,t)|\ ds\,dt \le \exp\left\{-\Omega\left(n \ln^2 n\right)\right\} \int_{\Pi_0} |F(s,t)|\ ds\,dt$

provided $m + n$ is sufficiently large.

Next, we prove that the integral over $V \setminus X(\epsilon)$ is asymptotically negligible.

As in the proof of Part (2) of Lemma 7.4, we conclude that for every piece $P_{A,B}$ and for every $(s,t) \in P_{A,B}$ we have

(7.5.2)
$\rho(s_{j_1} - s_{j_2}) \le \epsilon/20\tau$ for all $j_1, j_2 \in A$ and
$\rho(t_{k_1} - t_{k_2}) \le \epsilon/20\tau$ for all $k_1, k_2 \in B$.

Let us choose a point $(s,t) \in P_{A,B} \setminus X(\epsilon)$. Hence we have $\rho(s_{j_0} + t_{k_0}) > \epsilon/\tau$ for some $j_0$ and $k_0$. Let us pick any $j_1 \in A$ and $k_1 \in B$. Then

$\rho(s_{j_0} + t_{k_0}) = \rho(s_{j_0} + t_{k_0} + s_{j_1} + t_{k_1} - s_{j_1} - t_{k_1}) \le \rho(s_{j_0} + t_{k_1}) + \rho(s_{j_1} + t_{k_0}) + \rho(s_{j_1} + t_{k_1})$.

Since $\rho(s_{j_1} + t_{k_1}) \le \epsilon/40\tau$, we must have either

$\rho(s_{j_0} + t_{k_1}) > 39\epsilon/80\tau$,

in which case necessarily $j_0 \in \bar{A}$, or

$\rho(s_{j_1} + t_{k_0}) > 39\epsilon/80\tau$,

in which case necessarily $k_0 \in \bar{B}$.
In the first case (7.5.2) implies that

$\rho(s_{j_0} + t_k) > 35\epsilon/80\tau = 7\epsilon/16\tau$ for all $k \in B$

and in the second case (7.5.2) implies that

$\rho(s_j + t_{k_0}) > 35\epsilon/80\tau = 7\epsilon/16\tau$ for all $j \in A$.

For $j \in \bar{A}$ we define

(7.5.3) $Q_{A,B;j} = \left\{(s,t) \in P_{A,B} : \rho(s_j + t_k) > 7\epsilon/16\tau \text{ for all } k \in B\right\}$

and for $k \in \bar{B}$ we define

(7.5.4) $R_{A,B;k} = \left\{(s,t) \in P_{A,B} : \rho(s_j + t_k) > 7\epsilon/16\tau \text{ for all } j \in A\right\}$.

Then

(7.5.5) $P_{A,B} \setminus X(\epsilon) \subset \bigcup_{j \in \bar{A}} Q_{A,B;j} \cup \bigcup_{k \in \bar{B}} R_{A,B;k}$.

Let us compare the integrals

$\int_{Q_{A,B;j}} |F(s,t)|\ ds\,dt$ and $\int_{P_{A,B}} |F(s,t)|\ ds\,dt$.

Given a point $(s,t) \in P_{A,B}$ we obtain another point in $P_{A,B}$ if we arbitrarily choose coordinates $s_j \in [-\pi, \pi]$ for $j \in \bar{A}$ and $t_k \in [-\pi, \pi]$ for $k \in \bar{B}$. Let us pick a particular non-empty set $Q_{A,B;j_0}$ for some $j_0 \in \bar{A}$. We obtain a fiber $E = E_{j_0} \subset P_{A,B}$ if we let the coordinate $s_{j_0}$ vary arbitrarily between $-\pi$ and $\pi$ while fixing all other coordinates of some point $(s,t) \in P_{A,B}$. Geometrically, each fiber $E$ is an interval of length $2\pi$. We construct a set $I \subset E$ as follows: we choose an arbitrary coordinate $k_1 \in B$ and let $s_{j_0}$ vary in such a way that $\rho(s_{j_0} + t_{k_1}) \le \epsilon/20\tau$.

Geometrically, $I$ is an interval of length $\epsilon/10\tau$ or a union of two non-overlapping intervals of the total length $\epsilon/10\tau$. Moreover, by (7.5.2), we have

(7.5.6) $\rho(s_{j_0} + t_k) \le \epsilon/10\tau$ for all $k \in B$ and all $(s,t) \in I$.

As we vary $s_{j_0}$ without changing other coordinates, in the product (7.3.2) only the functions $f_{j_0k}$ change. Comparing (7.5.6) and (7.5.3) and using (7.3.2) and (7.3.3), we conclude that

$|F(s,t)| \le \exp\{-\Omega(n)\}\, |F(\tilde{s}, \tilde{t})|$ for all $(s,t) \in Q_{A,B;j_0} \cap E$ and all $(\tilde{s}, \tilde{t}) \in I$.

Therefore,

$\int_{E \cap Q_{A,B;j_0}} |F(s,t)|\ ds_{j_0} \le \exp\{-\Omega(n)\} \int_E |F(s,t)|\ ds_{j_0}$

provided $m + n$ is large enough (again, we use that $\tau$ is bounded by a polynomial in $m + n$). Integrating over all fibers $E \subset P_{A,B}$, we prove that

$\int_{Q_{A,B;j_0}} |F(s,t)|\ ds\,dt \le \exp\{-\Omega(n)\} \int_{P_{A,B}} |F(s,t)|\ ds\,dt$

provided $m + n$ is large enough. Similarly, we prove that for sets $R_{A,B;k}$ defined by (7.5.4) we have

$\int_{R_{A,B;k}} |F(s,t)|\ ds\,dt \le \exp\{-\Omega(n)\} \int_{P_{A,B}} |F(s,t)|\ ds\,dt$

provided $m + n$ is large enough. Since $|\bar{A}| \le \ln^2 m$ and $|\bar{B}| \le \ln^2 n$, from (7.5.5) we deduce that

$\int_{P_{A,B} \setminus X(\epsilon)} |F(s,t)|\ ds\,dt \le \exp\{-\Omega(n)\} \int_{P_{A,B}} |F(s,t)|\ ds\,dt$

provided $m + n$ is large enough. Finally, since the number of pieces $P_{A,B}$ does not exceed $\exp\left\{O\left(\ln^3 n\right)\right\}$, the proof follows by (7.5.1). □

Our next goal is to show that the integral over $\Pi_0 \setminus I$ is negligible, where

$I = \left\{(s,t) \in \Pi_0 : |s_j|, |t_k| \le \epsilon/\tau \text{ for all } j, k\right\}$.

We accomplish this in the next two lemmas.
(7.6) Lemma. For $\epsilon > 0$ let us define a set $Y(\epsilon) \subset \Pi_0$ by

$Y(\epsilon) = \left\{(s,t) \in \Pi_0 : |s_j + t_k| \le \epsilon/\tau \text{ for all } j = 1, \ldots, m \text{ and all } k = 1, \ldots, n-1\right\}$.

Then

$\int_{\Pi_0 \setminus Y(\epsilon)} |F(s,t)|\ ds\,dt \le \exp\{-\gamma(\delta, \epsilon)(m+n)\} \int_{\Pi_0} |F(s,t)|\ ds\,dt$

for some constant $\gamma(\delta, \epsilon) > 0$ and all sufficiently large $m + n$.

Proof. Without loss of generality, we assume that $\epsilon < \delta/2$, so $\epsilon/\tau < 1/2$.
Let $X(\epsilon)$ be the set of Proposition 7.5 and let us define

$Z(\epsilon) = \left\{(s,t) \in \Pi_0 : s_j, t_k \in [-\pi, -\pi + \epsilon/\tau] \cup [\pi - \epsilon/\tau, \pi] \text{ for all } j = 1, \ldots, m \text{ and all } k = 1, \ldots, n-1\right\}$.

We claim that

(7.6.1) $X(\epsilon) \subset Y(\epsilon) \cup Z(2\epsilon)$.

We note that if $\rho(x) \le \epsilon/\tau$ for some $-2\pi \le x \le 2\pi$ then either $|x| \le \epsilon/\tau$ or $x \ge 2\pi - \epsilon/\tau$ or $x \le -2\pi + \epsilon/\tau$. To prove (7.6.1), let us pick an arbitrary $(s,t) \in X(\epsilon)$. Suppose that

(7.6.2) $-\pi + 2\epsilon/\tau < s_{j_0} < \pi - 2\epsilon/\tau$ for some $j_0$.

Since $-\pi \le t_k \le \pi$ for all $k$, we have

$-2\pi + 2\epsilon/\tau < s_{j_0} + t_k < 2\pi - 2\epsilon/\tau$ for $k = 1, \ldots, n$.

Since $\rho(s_{j_0} + t_k) \le \epsilon/\tau$, we must have

$|s_{j_0} + t_k| \le \epsilon/\tau$ for $k = 1, \ldots, n-1$

and, therefore,

$-\pi + \epsilon/\tau < t_k < \pi - \epsilon/\tau$ for $k = 1, \ldots, n-1$.

Since $-\pi \le s_j \le \pi$ we conclude that

$-2\pi + \epsilon/\tau < s_j + t_k < 2\pi - \epsilon/\tau$ for all $j = 1, \ldots, m$ and $k = 1, \ldots, n-1$.

Since $\rho(s_j + t_k) \le \epsilon/\tau$ we conclude that

$|s_j + t_k| \le \epsilon/\tau$ for all $j = 1, \ldots, m$ and $k = 1, \ldots, n-1$

and hence $(s,t) \in Y(\epsilon)$.

Similarly, we prove that if

(7.6.3) $-\pi + 2\epsilon/\tau < t_{k_0} < \pi - 2\epsilon/\tau$ for some $k_0$

then $(s,t) \in Y(\epsilon)$. If both (7.6.2) and (7.6.3) are violated, then $(s,t) \in Z(2\epsilon)$ and so we obtain (7.6.1).

Next, we show that the integral over $Z(2\epsilon)$ is asymptotically negligible. The set $Z(2\epsilon)$ is a union of $2^{m+n-1}$ pairwise disjoint corners, where each corner is determined by a choice of the interval $[-\pi, -\pi + 2\epsilon/\tau]$ or $[\pi - 2\epsilon/\tau, \pi]$ for each coordinate $s_j$ and $t_k$. The transformation

$s_j \longmapsto s_j + \pi$ if $s_j \in [-\pi, -\pi + 2\epsilon/\tau]$ and $s_j \longmapsto s_j - \pi$ if $s_j \in [\pi - 2\epsilon/\tau, \pi]$, for $j = 1, \ldots, m$

and

$t_k \longmapsto t_k + \pi$ if $t_k \in [-\pi, -\pi + 2\epsilon/\tau]$ and $t_k \longmapsto t_k - \pi$ if $t_k \in [\pi - 2\epsilon/\tau, \pi]$, for $k = 1, \ldots, n-1$

is measure-preserving and maps $Z(2\epsilon)$ onto the cube

$I = \left\{(s,t) : |s_j|, |t_k| \le 2\epsilon/\tau \text{ for all } j, k\right\}$.

In the product (7.3.2), it does not change the value of $f_{jk}$ except when $k = n$ (recall that $t_n = 0$ on $\Pi_0$). Since $2\epsilon/\tau < 1$, by (7.3.3) the transformation increases the value of each function $f_{jn}$ by at least a factor of $\gamma(\delta) > 1$. Therefore,

$\int_{Z(2\epsilon)} |F(s,t)|\ ds\,dt \le \exp\{-\Omega(m)\} \int_I |F(s,t)|\ ds\,dt$

and the proof follows by (7.6.1) and Proposition 7.5. □
(7.7) Lemma. For $\epsilon > 0$ let us define the cube

$I(\epsilon) = \bigl\{ (s,t) \in \Pi_0 : \ |s_j|, |t_k| \le \epsilon/\tau \ \text{ for all } j, k \bigr\}.$

Then

$\int_{\Pi_0 \setminus I(\epsilon)} |F(s,t)| \, ds\,dt \ \le\ \exp\{-\gamma(\delta,\epsilon)(m+n)\} \int_{\Pi_0} |F(s,t)| \, ds\,dt$

for some $\gamma(\delta,\epsilon) > 0$ and $m + n$ large enough.

Proof. Without loss of generality, we assume that $\epsilon < \delta$, so $\epsilon/\tau < 1$.

Let $Y(\epsilon/20)$ be the set of Lemma 7.6, so the integral of $|F(s,t)|$ over
$\Pi_0 \setminus Y(\epsilon/20)$ is asymptotically negligible.

Let us choose a point $(s,t) \in Y(\epsilon/20)$. We have

$l\epsilon/20\tau \le s_1 \le (l+1)\epsilon/20\tau$ for some integer $l$.

Since $|s_1 + t_k| \le \epsilon/20\tau$, we obtain

$(-l-2)\epsilon/20\tau \le t_k \le (-l+1)\epsilon/20\tau$ for $k = 1,\ldots,n-1$

and then similarly

$(l-2)\epsilon/20\tau \le s_j \le (l+3)\epsilon/20\tau$ for $j = 1,\ldots,m$.

Let us denote

$w = \Bigl( \underbrace{\epsilon/20\tau, \ldots, \epsilon/20\tau}_{m \text{ times}};\ \underbrace{-\epsilon/20\tau, \ldots, -\epsilon/20\tau}_{n-1 \text{ times}},\ 0 \Bigr).$

Hence we conclude that

(7.7.1)    $Y(\epsilon/20) \ \subset \bigcup_{l \in \mathbb{Z}:\ |l| \le 1 + 20\pi\tau/\epsilon} \bigl( I(3\epsilon/20) + l w \bigr).$

Since $\tau$ is bounded by a polynomial in $m$ and $n$, the number of translates of the
cube $I(3\epsilon/20)$ in the right hand side of (7.7.1) is $(m+n)^{O(1)}$.
The translation

$(s,t) \longmapsto (s,t) + l w$

does not change the value of the functions $f_{jk}(s_j + t_k)$ in (7.3.2), unless $k = n$
(recall that $t_n = 0$ on $\Pi_0$).

For $(s,t) \in I(3\epsilon/20)$ we have $|s_j| \le 3\epsilon/20\tau$ for all $j$. For $(s,t) \in I(3\epsilon/20) + l w$
with $|l| > 10$, we have $|s_j| \ge 7\epsilon/20\tau$ for all $j$. Since $\epsilon/\tau < 1$, for all $l$ in the union
of (7.7.1) such that $|l| > 10$ and all $(s,t) \in I(3\epsilon/20) + l w$ we have $\rho(s_j) \ge 6\epsilon/20\tau$
for all $j = 1,\ldots,m$.

Using (7.3.2) and (7.3.3) we conclude that

$|F(s,t)| \le \exp\{-\Omega(m)\} \, |F(\tilde s, \tilde t)|$ for all $(\tilde s, \tilde t) \in I(3\epsilon/20)$

and for all $(s,t) \in I(3\epsilon/20) + l w$ with $|l| > 10$.

Since the number of translates in (7.7.1) is bounded by a polynomial in $(m+n)$
and since

$I(3\epsilon/20) + l w \subset I(\epsilon)$ provided $|l| \le 10$,

the proof follows by Lemma 7.6. □
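The covering step behind (7.7.1) can be tried out numerically. The following is a minimal sketch, not part of the original argument: the helper names `sample_point_in_Y` and `shift_into_cube` and the parameter values are illustrative assumptions, and `eta` stands for $\epsilon/(20\tau)$. It samples points with $|s_j + t_k| \le \eta$, picks the integer $l$ determined by $s_1$, and checks that subtracting $l w$ lands in the cube of half-width $3\eta$.

```python
import math
import random

def sample_point_in_Y(m, n, eta, rng):
    """Sample (s_1..s_m, t_1..t_{n-1}) with |s_j + t_k| <= eta for all j, k."""
    base = rng.uniform(-3.0, 3.0)  # common offset, kept inside [-pi, pi]
    s = [base + rng.uniform(-eta / 2, eta / 2) for _ in range(m)]
    t = [-base + rng.uniform(-eta / 2, eta / 2) for _ in range(n - 1)]
    return s, t

def shift_into_cube(s, t, eta):
    """Pick the integer l with l*eta <= s_1 < (l+1)*eta and subtract l*w."""
    l = math.floor(s[0] / eta)
    shifted_s = [sj - l * eta for sj in s]  # w has +eta in the s-coordinates
    shifted_t = [tk + l * eta for tk in t]  # and -eta in the t-coordinates
    return l, shifted_s, shifted_t

rng = random.Random(0)
eps, tau = 0.5, 4.0          # illustrative values only
eta = eps / (20 * tau)
s, t = sample_point_in_Y(5, 4, eta, rng)
l, s0, t0 = shift_into_cube(s, t, eta)
print(l, all(abs(v) <= 3 * eta + 1e-12 for v in s0 + t0))
```

The number of admissible shifts $l$ grows like $\tau/\epsilon$, which is why the proof needs $\tau$ to be polynomially bounded at this point.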

To finish the proof of Theorem 7.1 we need a concentration inequality for strictly 
log-concave probability measures. 
(7.8) Theorem. Let $V$ be Euclidean space with the norm $\|\cdot\|$, let $B \subset V$ be a
convex body, let us consider a probability measure $\mathbf{P}$ supported on $B$ with the density
$e^{-U}$, where $U : B \to \mathbb{R}$ is a function satisfying

$U(x) + U(y) - 2U\left(\frac{x+y}{2}\right) \ \ge\ c\,\|x - y\|^2$ for all $x, y \in B$

and some constant $c > 0$. For a point $x \in V$ and a closed subset $A \subset V$ we define
the distance

$\operatorname{dist}(x, A) = \min_{y \in A} \|x - y\|.$

Let $A \subset B$ be a closed set such that $\mathbf{P}(A) \ge 1/2$. Then, for any $r > 0$ we have

$\mathbf{P}\bigl\{ x \in B : \ \operatorname{dist}(x, A) \ge r \bigr\} \ \le\ 2e^{-cr^2}.$

Proof. See, for example, Section 2.2 of [Le01] or Theorem 8.1 and its proof in [B97a],
which, although stated for the Gaussian measure, is adapted in a straightforward
way to our situation.

□
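As a sanity check of the shape of Theorem 7.8, here is a small one-dimensional sketch. It rests on assumptions that are not in the text: $B = [-L, L]$, the density is the standard Gaussian conditioned on $B$, so $U(x) = x^2/2 + \text{const}$ and $U(x) + U(y) - 2U((x+y)/2) = (x-y)^2/4$, giving $c = 1/4$; and $A = [-L, 0]$, so that $\mathbf{P}(A) = 1/2$ and $\{\operatorname{dist}(x,A) \ge r\} = \{x \ge r\}$. The function names are hypothetical.

```python
import math

def std_normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def truncated_tail(r, half_width):
    """P{x >= r} for a standard Gaussian conditioned on B = [-half_width, half_width]."""
    upper = std_normal_cdf(half_width)
    mass = upper - std_normal_cdf(-half_width)
    return max(0.0, upper - std_normal_cdf(min(r, half_width))) / mass

def concentration_bound(r, c=0.25):
    """Right-hand side 2*exp(-c r^2); c = 1/4 matches U(x) = x^2/2 + const."""
    return 2.0 * math.exp(-c * r * r)

for r in (0.5, 1.0, 2.0, 4.0):
    print(r, truncated_tail(r, 6.0), concentration_bound(r))
```

In this toy case the bound is far from tight; the point is only that the tail decays at a Gaussian rate governed by the convexity constant $c$.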

Here is how we apply Theorem 7.8. 

(7.9) Lemma. Let us choose $0 < \epsilon < \delta^2/10$. In the space $\mathbb{R}^{m+n}$ let us consider
the hyperplane

$H = \Bigl\{ (s_1, \ldots, s_m; t_1, \ldots, t_n) : \ \sum_{j=1}^{m} s_j = \sum_{k=1}^{n} t_k \Bigr\}.$

Let $B \subset H$ be a convex body centrally symmetric about the origin: $(s,t) \in B$ if and
only if $(-s,-t) \in B$, and such that for all $(s,t) \in B$ we have

$|s_j| \le \epsilon/\tau$ for $j = 1,\ldots,m$
$|t_k| \le \epsilon/\tau$ for $k = 1,\ldots,n$.

Let us consider the probability measure on $B$ with the density proportional to
$|F(s,t)|$. Then, for any $\kappa > 0$ we have

$\mathbf{P}\Bigl\{ (s,t) \in B : \ |s_j|, |t_k| \le \frac{\ln(m+n)}{2\tau\sqrt{m+n}} \ \text{ for all } j, k \Bigr\} \ \ge\ 1 - (m+n)^{-\kappa}$

provided $m + n > \gamma(\delta,\kappa)$ for some constant $\gamma(\delta,\kappa) > 0$.
Proof. Let $f_{jk}$ be the functions defined in Section 7.3 and let

$u_{jk} = -\ln f_{jk}.$

We define

$U(s,t) = a + \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} u_{jk}(s_j + t_k)$ for $(s,t) \in B$,

where $a$ is a constant chosen in such a way that $e^{-U}$
is a probability density on $B$. It follows by (7.3.4) that

(7.9.1)    $u_{jk}(x) + u_{jk}(y) - 2u_{jk}\left(\frac{x+y}{2}\right) \ \ge\ \Omega\bigl(\tau^2 (x-y)^2\bigr)$
provided $|x|, |y| \le 2\epsilon/\tau$.

Let us consider the map

$M : (s_1, \ldots, s_m; t_1, \ldots, t_n) \longmapsto (\ldots, s_j + t_k, \ldots)$

as a map $M : H \to \mathbb{R}^{mn}$. From Lemma 3.6

$\|Mx\|^2 \ \ge\ \min\{m, n\}\,\|x\|^2$ for all $x \in H$,

where $\|\cdot\|$ is the Euclidean norm in the corresponding space. It follows then by
(7.9.1) that

$U(x) + U(y) - 2U\left(\frac{x+y}{2}\right) \ \ge\ \Omega\bigl(\tau^2 n \|x - y\|^2\bigr)$ for all $x, y \in B$.

Now we apply Theorem 7.8 with

$c = \Omega(\tau^2 n)$

to the probability density $e^{-U}$ on $B$.
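The norm inequality quoted from Lemma 3.6 can be checked directly: for $x = (s,t)$ one has $\|Mx\|^2 = n\|s\|^2 + m\|t\|^2 + 2(\sum_j s_j)(\sum_k t_k)$, and on $H$ the last term is a square. The sketch below, with hypothetical helper names, samples random points of $H$ and compares both sides.

```python
import random

def pairwise_sum_norm_sq(s, t):
    """||Mx||^2 = sum over all (j, k) of (s_j + t_k)^2."""
    return sum((sj + tk) ** 2 for sj in s for tk in t)

def random_point_on_H(m, n, rng):
    """Random (s, t) with sum(s) == sum(t), i.e. a point of the hyperplane H."""
    s = [rng.uniform(-1, 1) for _ in range(m)]
    t = [rng.uniform(-1, 1) for _ in range(n)]
    shift = (sum(s) - sum(t)) / n  # move t so that the two sums agree
    return s, [tk + shift for tk in t]

rng = random.Random(0)
m, n = 4, 6
s, t = random_point_on_H(m, n, rng)
lhs = pairwise_sum_norm_sq(s, t)
rhs = min(m, n) * (sum(v * v for v in s) + sum(v * v for v in t))
print(lhs >= rhs, lhs, rhs)
```

Off the hyperplane $H$ the inequality can fail, for example on the null direction $(1,\ldots,1;-1,\ldots,-1)$ where $Mx = 0$.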

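For $j = 1,\ldots,m$, let $S_j^+$ be the set consisting of the points $(s,t) \in B$ with
$s_j \ge 0$, let $S_j^-$ be the set consisting of the points $(s,t) \in B$ with $s_j \le 0$, and for
$k = 1,\ldots,n$ let $T_k^+$ be the set consisting of the points $(s,t) \in B$ with $t_k \ge 0$ and let
$T_k^-$ be the set consisting of the points $(s,t) \in B$ with $t_k \le 0$. Since both $B$ and the
probability measure are invariant under the symmetry

$(s,t) \longmapsto (-s,-t),$

we have

$\mathbf{P}(S_j^+) = \mathbf{P}(S_j^-) = \mathbf{P}(T_k^+) = \mathbf{P}(T_k^-) = \frac{1}{2}.$

We note that

if $|s_j| \ge r$ then at least one of $\operatorname{dist}\bigl((s,t), S_j^+\bigr)$, $\operatorname{dist}\bigl((s,t), S_j^-\bigr)$ is at least $r$, and
if $|t_k| \ge r$ then at least one of $\operatorname{dist}\bigl((s,t), T_k^+\bigr)$, $\operatorname{dist}\bigl((s,t), T_k^-\bigr)$ is at least $r$.

Applying Theorem 7.8 with

$r = \frac{\ln(m+n)}{2\tau\sqrt{m+n}},$

we conclude that for all $j$ and $k$

$\mathbf{P}\Bigl\{ (s,t) \in B : \ |s_j| \ge \frac{\ln(m+n)}{2\tau\sqrt{m+n}} \Bigr\} \ \le\ \exp\{-\Omega(\ln^2 m)\}$ and

$\mathbf{P}\Bigl\{ (s,t) \in B : \ |t_k| \ge \frac{\ln(m+n)}{2\tau\sqrt{m+n}} \Bigr\} \ \le\ \exp\{-\Omega(\ln^2 n)\}$

and the proof follows. □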
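The last step can be spelled out. The following sketch assumes, as is not restated in this section, that $m$ and $n$ are comparable, so that $n/(m+n)$ is bounded below by a positive constant depending only on $\delta$. With $c = \Omega(\tau^2 n)$ and $r = \ln(m+n)/(2\tau\sqrt{m+n})$ from the proof:

```latex
c\,r^{2}
  \;=\; \Omega\!\left(\tau^{2} n\right)\cdot\frac{\ln^{2}(m+n)}{4\,\tau^{2}\,(m+n)}
  \;=\; \Omega\!\left(\frac{n}{m+n}\,\ln^{2}(m+n)\right)
  \;=\; \Omega\!\left(\ln^{2}(m+n)\right),
\qquad\text{hence}\qquad
2e^{-c r^{2}} \;\le\; \exp\!\left\{-\Omega\!\left(\ln^{2}(m+n)\right)\right\}.
```

Since $\exp\{-\Omega(\ln^2(m+n))\}$ decays faster than any fixed power of $m+n$, a union bound over the $m+n$ coordinates (two half-spaces for each) keeps the total exceptional probability below $(m+n)^{-\kappa}$ once $m+n$ is large enough.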

Now we are ready to prove Theorem 7.1. 

(7.10) Proof of Theorem 7.1. Let us choose $0 < \epsilon < \delta^2/10$ as in Lemma
7.9 and let $H \subset \mathbb{R}^{m+n}$ be the hyperplane defined in Lemma 7.9. We identify
$\mathbb{R}^{m+n-1}$ with the hyperplane $t_n = 0$ in $\mathbb{R}^{m+n}$. We consider a linear transformation
$T : H \to \mathbb{R}^{m+n-1}$,

$(s_1, \ldots, s_m; t_1, \ldots, t_n) \longmapsto (s_1 + t_n, \ldots, s_m + t_n; \ t_1 - t_n, \ldots, t_{n-1} - t_n, 0).$

The inverse linear transformation

$T^{-1} : (s'_1, \ldots, s'_m; t'_1, \ldots, t'_{n-1}, 0) \longmapsto (s_1, \ldots, s_m; t_1, \ldots, t_n)$

is computed as follows:

$t_n = \frac{s'_1 + \ldots + s'_m - t'_1 - \ldots - t'_{n-1}}{m+n}, \qquad s_j = s'_j - t_n, \qquad t_k = t'_k + t_n.$
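The formula for the inverse of $T$ follows from $\sum_j s_j = \sum_k t_k$ on $H$: summing the primed coordinates gives $\sum_j s'_j - \sum_{k<n} t'_k = (m+n)\,t_n$. A short round-trip check, written as a sketch with hypothetical function names (the trailing zero coordinate of the image is omitted from the lists):

```python
import random

def T(s, t):
    """Map a point of H to (s_j + t_n; t_k - t_n for k < n); the final 0 is dropped."""
    tn = t[-1]
    return [sj + tn for sj in s], [tk - tn for tk in t[:-1]]

def T_inverse(s_prime, t_prime):
    """Recover (s; t_1..t_n) on H from the primed coordinates."""
    m, n = len(s_prime), len(t_prime) + 1
    tn = (sum(s_prime) - sum(t_prime)) / (m + n)
    return [sp - tn for sp in s_prime], [tp + tn for tp in t_prime] + [tn]

def point_on_H(m, n, rng):
    s = [rng.uniform(-1, 1) for _ in range(m)]
    t = [rng.uniform(-1, 1) for _ in range(n)]
    shift = (sum(s) - sum(t)) / n
    return s, [tk + shift for tk in t]

rng = random.Random(0)
s, t = point_on_H(4, 3, rng)
s_back, t_back = T_inverse(*T(s, t))
print(max(abs(a - b) for a, b in zip(s + t, s_back + t_back)))
```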

Let us consider the cube $I = I(\epsilon/2) \subset \mathbb{R}^{m+n-1}$ defined by the inequalities

$|s_j|, |t_k| \le \epsilon/2\tau$ for $j = 1,\ldots,m$ and $k = 1,\ldots,n-1$.

By Lemma 7.7 we have

(7.10.1)    $\int_{\Pi_0 \setminus I} |F(s,t)| \, ds\,dt \ \le\ \exp\{-\Omega(m+n)\} \int_{\Pi_0} |F(s,t)| \, ds\,dt$

for all sufficiently large $m$ and $n$.
Let $B = T^{-1}(I) \subset H$. Then $B$ is centrally symmetric and convex, and for all
$(s,t) \in B$ we have $|s_j|, |t_k| \le \epsilon/\tau$ for all $j$ and $k$. Let

$A = \Bigl\{ (s,t) \in B : \ |s_j| \le \frac{\ln(m+n)}{2\tau\sqrt{m+n}} \text{ for } j = 1,\ldots,m \ \text{ and } \ |t_k| \le \frac{\ln(m+n)}{2\tau\sqrt{m+n}} \text{ for } k = 1,\ldots,n \Bigr\}.$

By Lemma 7.9,

$\int_{B \setminus A} |F(s,t)| \, ds\,dt \ \le\ (m+n)^{-\kappa} \int_{B} |F(s,t)| \, ds\,dt$

provided $m + n > \gamma(\delta,\kappa)$ for some $\gamma(\delta,\kappa) > 0$. Now, the push-forward of the
probability measure on $B$ with the density proportional to $|F(s,t)|$ under the map $T$
is the probability measure on $I$ with the density proportional to $|F(s,t)|$. Moreover,
the image $T(A)$ lies in the cube $U$ defined by the inequalities

$|s_j|, |t_k| \le \frac{\ln(m+n)}{\tau\sqrt{m+n}}$ for $j = 1,\ldots,m$ and $k = 1,\ldots,n-1$.

Therefore,

(7.10.2)    $\int_{I \setminus U} |F(s,t)| \, ds\,dt \ \le\ (m+n)^{-\kappa} \int_{I} |F(s,t)| \, ds\,dt$

provided $m + n > \gamma(\delta,\kappa)$ for some $\gamma(\delta,\kappa) > 0$. The proof now follows by (7.10.1)
and (7.10.2). □

8. Proof of Theorem 1.3 

First, we prove Theorem 1.3 assuming, additionally, that $\tau < (m+n)^{1/\delta}$ in
(1.1.3).

(8.1) Proof of Theorem 1.3 under the additional assumption that $\tau$ is
bounded by a polynomial in $m + n$. All constants implicit in the "O" and "Ω"
notation below may depend only on the parameter $\delta$. We say that $m$ and $n$ are
sufficiently large provided $m + n > \gamma(\delta)$ for some constant $\gamma(\delta) > 0$.

As in Corollary 2.2, we represent the number $\#(R,C)$ of tables as the integral

$\#(R,C) = \frac{e^{g(Z)}}{(2\pi)^{m+n-1}} \int_{\Pi_0} F(s,t) \, ds\,dt.$

Let $U \subset \Pi_0$ be the neighborhood of the origin as defined in Theorems 6.1 and 7.1.
From Parts (2), (3) and (4) of Theorem 6.1 we conclude that the integrals of $F(s,t)$
and $|F(s,t)|$ over $U$ are of the same order, that is

$\int_U |F(s,t)| \, ds\,dt \ \le\ O\left( \left| \int_U F(s,t) \, ds\,dt \right| \right)$

provided $m + n$ is sufficiently large. Theorem 7.1 implies then that the integral of
$F(s,t)$ over $\Pi_0 \setminus U$ is asymptotically negligible: for any $\kappa > 0$ we have

$\left| \int_{\Pi_0 \setminus U} F(s,t) \, ds\,dt \right| \ \le\ (m+n)^{-\kappa} \left| \int_U F(s,t) \, ds\,dt \right|$

provided $m + n > \gamma(\delta,\kappa)$ for some $\gamma(\delta,\kappa) > 0$.
We use Part (3) of Theorem 6.1 to compute

$\int_U F(s,t) \, ds\,dt.$

Identifying $\mathbb{R}^{m+n-1}$ with the hyperplane $t_n = 0$ in $\mathbb{R}^{m+n}$, we note that

$\int_{\mathbb{R}^{m+n-1}} e^{-q} \, ds\,dt = \frac{\pi^{(m+n-1)/2}}{\sqrt{\det q|_{\mathbb{R}^{m+n-1}}}}$

and that by Lemma 3.5 we have

$\det q|_{\mathbb{R}^{m+n-1}} = \frac{1}{m+n} \det q|_H,$

where $H$ is the hyperplane orthogonal to the null-space of $q$.
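The relation between the two determinants can be tested on small examples. The sketch below makes an assumption not taken from this section: it uses a form of the shape $q(s,t) = \frac{1}{2}\sum_{j,k} w_{jk}(s_j + t_k)^2$ with arbitrary positive weights $w_{jk}$ standing in for the coefficients of (1.2.1), whose null-space is spanned by $u = (1,\ldots,1;-1,\ldots,-1)$. The determinant on $H = u^{\perp}$ is computed in a non-orthonormal basis and normalized by the Gram determinant; all function names are hypothetical.

```python
import random

def det(a):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    size, d = len(a), 1.0
    for i in range(size):
        p = max(range(i, size), key=lambda r: abs(a[r][i]))
        if a[p][i] == 0.0:
            return 0.0
        if p != i:
            a[i], a[p] = a[p], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, size):
            f = a[r][i] / a[i][i]
            for c in range(i, size):
                a[r][c] -= f * a[i][c]
    return d

def quad_form_matrix(w):
    """Matrix Q with x^T Q x = (1/2) * sum_jk w[j][k] * (s_j + t_k)^2, x = (s; t)."""
    m, n = len(w), len(w[0])
    Q = [[0.0] * (m + n) for _ in range(m + n)]
    for j in range(m):
        for k in range(n):
            h = 0.5 * w[j][k]
            Q[j][j] += h
            Q[m + k][m + k] += h
            Q[j][m + k] += h
            Q[m + k][j] += h
    return Q

def det_on_coordinate_hyperplane(Q):
    """det of q restricted to {t_n = 0}: drop the last row and column."""
    return det([row[:-1] for row in Q[:-1]])

def det_on_H(Q, m, n):
    """det of q restricted to H = u-perp, u = (1,..,1; -1,..,-1)."""
    d = m + n
    u = [1.0] * m + [-1.0] * n
    # Basis of H: b_i = e_i + u_i * e_d for i < d (orthogonal to u since u_d = -1).
    B = [[(1.0 if p == i else 0.0) + (u[i] if p == d - 1 else 0.0)
          for p in range(d)] for i in range(d - 1)]
    def gram(M):
        return [[sum(B[i][p] * M[p][r] * B[k][r] for p in range(d) for r in range(d))
                 for k in range(d - 1)] for i in range(d - 1)]
    identity = [[1.0 if p == r else 0.0 for r in range(d)] for p in range(d)]
    return det(gram(Q)) / det(gram(identity))

rng = random.Random(0)
m, n = 3, 2
w = [[rng.uniform(0.5, 2.0) for _ in range(n)] for _ in range(m)]
Q = quad_form_matrix(w)
print((m + n) * det_on_coordinate_hyperplane(Q), det_on_H(Q, m, n))
```

The factor $1/(m+n)$ is the squared cosine of the angle between the normals $e_{t_n}$ and $u/\|u\|$ of the two hyperplanes, which is the source of the $\sqrt{m+n}$ in the final formula.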

To conclude the proof, we note that by Lemma 3.1 the values of

$\mu = \mathbf{E}\, f^2$ and $\nu = \mathbf{E}\, h$

can be computed with respect to the Gaussian probability measure with the density
proportional to $e^{-q}$ in an arbitrary hyperplane $L \subset \mathbb{R}^{m+n}$ not containing the
null-space of $q$. □

To handle the case of super-polynomial $\tau$, we use a result of [D+97, Lemma 3],
which shows that $\#(R,C) \approx \operatorname{vol} P(R,C)$ provided the margins $R$ and $C$ are large
enough (it suffices to have $\tau > (mn)^2$). Then we note that

$\operatorname{vol} P(\alpha R, \alpha C) = \alpha^{(m-1)(n-1)} \operatorname{vol} P(R,C)$ for $\alpha > 0$

and show that the formula of Theorem 1.3 scales similarly. In the next three lemmas
we show that the typical matrix of $(\alpha R, \alpha C)$ is approximately the typical matrix
of $(R,C)$ multiplied by $\alpha$ and that the typical matrix of $(R,C)$ is approximately
the typical matrix of $(R', C')$ if $R' \approx R$ and $C' \approx C$. We then complete our proof
of Theorem 1.3.
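The scaling heuristic can be observed on a toy case. The following sketch is a brute-force counter for $3 \times 3$ tables only (an illustrative choice, with a hypothetical function name); it counts tables with margins $\alpha R$, $\alpha C$ for $R = C = (1,1,1)$ and prints the count divided by $\alpha^{(m-1)(n-1)} = \alpha^4$, which should level off as $\alpha$ grows if the count behaves like a volume.

```python
def count_tables_3x3(R, C):
    """Count 3x3 non-negative integer matrices with row sums R and column sums C."""
    r1, r2, r3 = R
    c1, c2, c3 = C
    assert r1 + r2 + r3 == c1 + c2 + c3
    count = 0
    for a in range(min(r1, c1) + 1):                 # x11
        for b in range(min(r1 - a, c2) + 1):         # x12, so x13 = r1 - a - b >= 0
            x13 = r1 - a - b
            if x13 > c3:
                continue
            for c in range(min(r2, c1 - a) + 1):     # x21, so x31 = c1 - a - c >= 0
                for d in range(min(r2 - c, c2 - b) + 1):  # x22, so x32 = c2 - b - d >= 0
                    x23 = r2 - c - d
                    x33 = c3 - x13 - x23             # third row then sums to r3 automatically
                    if x33 >= 0:
                        count += 1
    return count

for alpha in (1, 2, 4, 8, 16):
    margins = (alpha, alpha, alpha)
    n_tables = count_tables_3x3(margins, margins)
    print(alpha, n_tables, n_tables / alpha ** 4)
```

For small $\alpha$ the ratio is still dominated by lower-order terms, which is consistent with the requirement in the text that the margins be large before the count is replaced by the volume.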

In Lemmas 8.2 and 8.3 below, all implicit constants in the "O" notation are 
absolute. 
(8.2) Lemma. Let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be positive (not necessarily
integer) vectors such that $r_1 + \ldots + r_m = c_1 + \ldots + c_n$ and let $Z = (\zeta_{jk})$ be
the typical matrix maximizing the value of

$g(X) = \sum_{\substack{1 \le j \le m \\ 1 \le k \le n}} \bigl( (x_{jk} + 1) \ln(x_{jk} + 1) - x_{jk} \ln x_{jk} \bigr)$

on the polytope $P(R,C)$ of $m \times n$ non-negative matrices with row sums $R$ and
column sums $C$.
Let

$r_- = \min_{j=1,\ldots,m} r_j, \quad c_- = \min_{k=1,\ldots,n} c_k$ and

$r_+ = \max_{j=1,\ldots,m} r_j, \quad c_+ = \max_{k=1,\ldots,n} c_k.$

Then

$\zeta_{jk} \ \ge\ \frac{r_- c_-}{r_+ m}$ and $\zeta_{jk} \ \ge\ \frac{c_- r_-}{c_+ n}$ for all $j, k$.

Proof. This is Part (1) of Theorem 3.5 (Theorem 3.3 of the journal version) of
[B+08]. □
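For a concrete feel of the lower bounds in Lemma 8.2, the typical matrix can be computed explicitly in the $2 \times 2$ case, where the polytope $P(R,C)$ is a segment parametrized by $x = \zeta_{11}$. The sketch below is restricted to that case by assumption and uses hypothetical function names; it finds the maximizer by bisection on the derivative of the concave objective, using $g'(y) = \ln(1 + 1/y)$.

```python
import math

def typical_matrix_2x2(R, C, iters=200):
    """Maximize sum of g(x_jk) over 2x2 non-negative matrices with margins R, C."""
    r1, r2 = R
    c1, c2 = C
    assert abs((r1 + r2) - (c1 + c2)) < 1e-12
    lo = max(0.0, c1 - r2)  # feasible range for x = zeta_11
    hi = min(r1, c1)

    def dg(y):              # g'(y) = ln(1 + 1/y)
        return math.log1p(1.0 / y)

    def slope(x):           # derivative of the objective along the feasible segment
        return dg(x) - dg(r1 - x) - dg(c1 - x) + dg(r2 - c1 + x)

    for _ in range(iters):  # slope is decreasing in x, so bisection applies
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return [[x, r1 - x], [c1 - x, r2 - c1 + x]]

def lemma_8_2_lower_bounds(R, C):
    """The two lower bounds r_- c_- / (r_+ m) and c_- r_- / (c_+ n)."""
    m, n = len(R), len(C)
    return (min(R) * min(C) / (max(R) * m), min(C) * min(R) / (max(C) * n))

R, C = (30.0, 50.0), (20.0, 60.0)
Z = typical_matrix_2x2(R, C)
print(Z, lemma_8_2_lower_bounds(R, C))
```

With equal column sums, for instance $R = (3,5)$ and $C = (4,4)$, the second bound is attained, so the inequalities of the lemma cannot be made strict.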

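(8.3) Lemma. Let $Z = (\zeta_{jk})$ be the $m \times n$ typical matrix of margins $(R,C)$ such
that

$\delta\tau \le \zeta_{jk} \le \tau$ for all $j, k$,

for some $0 < \delta < 1/2$ and some $\tau > 1$.

Let $0 < \alpha < 1$ and let $X = (\xi_{jk})$ be the typical matrix of margins $(\alpha R, \alpha C)$.
Then the following holds:

(1) We have

$\xi_{jk} \ \ge\ \alpha\delta^2\tau$ for all $j, k$.

(2) Suppose $\alpha\delta^2\tau > 1$. Then

$\bigl| g(Z) + mn \ln \alpha - g(X) \bigr| = O\left( \frac{mn}{\alpha\delta^2\tau} \right).$

(3) There exists an absolute constant $\gamma > 1$ such that if $\alpha\delta^4\tau > \gamma mn$ then

$\bigl| \xi_{jk} - \alpha\zeta_{jk} \bigr| \ \le\ \epsilon\,\alpha\zeta_{jk}$ for all $j, k$

and

$\epsilon = O\left( \sqrt{\frac{mn}{\alpha\delta^4\tau}} \right).$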
Proof. Let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$. Thus in Lemma 8.2 we have

$r_- \ge \delta n\tau, \quad c_- \ge \delta m\tau$ and $r_+ \le \tau n$.

Applying Lemma 8.2 to the scaled margins $(\alpha R, \alpha C)$, we obtain Part (1).
Since $\alpha Z \in P(\alpha R, \alpha C)$ and $\alpha^{-1} X \in P(R,C)$, we have

(8.3.1)    $g(Z) \ge g(\alpha^{-1} X)$ and $g(X) \ge g(\alpha Z)$.

Since for $x \ge 1$ we have

(8.3.2)    $g(x) = (x+1)\ln(x+1) - x\ln x + (x+1)\ln x - (x+1)\ln x = (x+1)\ln\left(1 + \frac{1}{x}\right) + \ln x = 1 + \ln x + O\left(\frac{1}{x}\right),$

from (8.3.1) we obtain Part (2).
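The expansion (8.3.2) is easy to confirm numerically. The sketch below (function names are illustrative) evaluates the remainder $g(x) - 1 - \ln x$, which by the Taylor series of $\ln(1 + 1/x)$ behaves like $1/(2x)$ for large $x$.

```python
import math

def g(x):
    """g(x) = (x + 1) ln(x + 1) - x ln x for x > 0."""
    return (x + 1) * math.log(x + 1) - x * math.log(x)

def remainder(x):
    """g(x) - (1 + ln x), the O(1/x) term of (8.3.2)."""
    return g(x) - (1 + math.log(x))

for x in (1.0, 10.0, 100.0, 1000.0):
    print(x, remainder(x), x * remainder(x))
```

Summing this expansion over the $mn$ entries of $\alpha^{-1}X$ and of $\alpha Z$ is what produces the $mn \ln \alpha$ term and the $O(mn/(\alpha\delta^2\tau))$ error in Part (2).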

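Let us consider the interval $[X, \alpha Z] \subset P(\alpha R, \alpha C)$. Since $g$ is concave, we have

(8.3.3)    $|g(Y) - g(X)| = O\left( \frac{mn}{\alpha\delta^2\tau} \right)$ for all $Y \in [X, \alpha Z]$.

Suppose that for some $0 < \epsilon < 1/2$ we have

$|\xi_{jk} - \alpha\zeta_{jk}| > \epsilon\,\alpha\zeta_{jk}$ for some $j, k$.

Then there is a matrix $Y \in [X, \alpha Z]$, $Y = (\eta_{jk})$, such that $|\eta_{jk} - \alpha\zeta_{jk}| = \epsilon\,\alpha\zeta_{jk}$. We
note that

$g''(x) = -\frac{1}{x(x+1)}$

and, in particular,

$g''(x) \ \le\ -\frac{1}{8\alpha^2\tau^2}$ for all $x \in [\eta_{jk}, \alpha\zeta_{jk}]$.

Next, we are going to exploit the strong concavity of $g$ and use the following standard
inequality: if $g''(x) \le -\beta$ for some $\beta > 0$ and all $a \le x \le b$ then

$g\left( \frac{a+b}{2} \right) - \frac{1}{2} g(a) - \frac{1}{2} g(b) \ \ge\ \frac{\beta (b-a)^2}{8}.$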
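The standard inequality can be checked for the function $g$ itself. In the sketch below (illustrative names; the choice of $\beta$ is an assumption made for the check), $\beta$ is taken as the smallest value of $-g''(x) = 1/(x(x+1))$ on the interval, which is attained at the right endpoint.

```python
import math
import random

def g(x):
    return (x + 1) * math.log(x + 1) - x * math.log(x)

def midpoint_gap(a, b):
    """g((a+b)/2) - g(a)/2 - g(b)/2, non-negative by concavity."""
    return g(0.5 * (a + b)) - 0.5 * g(a) - 0.5 * g(b)

def strong_concavity_bound(a, b):
    """beta*(b-a)^2/8 with beta = min of 1/(x(x+1)) over the interval."""
    hi = max(a, b)
    beta = 1.0 / (hi * (hi + 1.0))
    return beta * (b - a) ** 2 / 8.0

rng = random.Random(0)
a, b = rng.uniform(0.5, 50.0), rng.uniform(0.5, 50.0)
print(midpoint_gap(a, b), strong_concavity_bound(a, b))
```

The inequality is sharp for a quadratic with second derivative exactly $-\beta$, which is how the constant $1/8$ arises.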

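Applying the above inequality to $g$ with $a = \eta_{jk}$, $b = \alpha\zeta_{jk}$ and $\beta = 1/8\alpha^2\tau^2$, we
obtain

$g\left( \frac{\eta_{jk} + \alpha\zeta_{jk}}{2} \right) - \frac{1}{2} g(\eta_{jk}) - \frac{1}{2} g(\alpha\zeta_{jk}) \ \ge\ \frac{\epsilon^2\delta^2}{64}.$

Let $W = (Y + \alpha Z)/2$. Then $W \in [X, \alpha Z]$ and by (8.3.3)

$g(W) \ \ge\ g(X) + \frac{\epsilon^2\delta^2}{64} - O\left( \frac{mn}{\alpha\delta^2\tau} \right).$

Since $g(W) \le g(X)$, the proof follows. □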
(8.4) Lemma. Let $Z = (\zeta_{jk})$ be the $m \times n$ typical matrix of margins $R =
(r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ such that

$\delta\tau \le \zeta_{jk} \le \tau$ for all $j, k$,

for some $0 < \delta < 1/2$ and some $\tau > 1$.

Let $0 < \epsilon < 1/2$ and let $X = (\xi_{jk})$ be the typical matrix of some margins
$R' = (r'_1, \ldots, r'_m)$ and $C' = (c'_1, \ldots, c'_n)$ such that

$(1 - \epsilon) r_j \le r'_j \le r_j$ for $j = 1,\ldots,m$ and
$(1 - \epsilon) c_k \le c'_k \le c_k$ for $k = 1,\ldots,n$.

Suppose that $\delta^2\tau > 1$. Then the following holds:

(1) We have

$|g(X) - g(Z)| = O(mn\epsilon).$

(2) There exists an absolute constant $\gamma$ such that if $\epsilon < \gamma\delta^2/mn$ then

$|\xi_{jk} - \zeta_{jk}| \ \le\ \beta\zeta_{jk}$ for all $j, k$

and

$\beta = O\left( \sqrt{\frac{mn\epsilon}{\delta^2}} \right).$

Proof. Let $A = (a_1, \ldots, a_m)$ and $B = (b_1, \ldots, b_n)$ be margins and let $A' =
(a'_1, \ldots, a'_m)$ and $B' = (b'_1, \ldots, b'_n)$ be some other margins such that $a'_j \le a_j$ and
$b'_k \le b_k$ for all $j$ and $k$. Then there exists a non-negative matrix $D$ with margins
$a_j - a'_j$ and $b_k - b'_k$ and for such a $D$ we have $Y + D \in P(A, B)$ for all $Y \in P(A', B')$.
Since $g$ is monotone increasing, we obtain

$g\bigl((1-\epsilon)Z\bigr) \ \le \max_{Y \in P((1-\epsilon)R,\,(1-\epsilon)C)} g(Y) \ \le\ g(X) = \max_{Y \in P(R', C')} g(Y) \ \le \max_{Y \in P(R, C)} g(Y) = g(Z).$

Hence

$g\bigl((1-\epsilon)Z\bigr) \le g(X) \le g(Z)$

and from (8.3.2) we deduce Part (1).

We note that $Z$ is the maximum point of $g$ on the polytope of non-negative $m \times n$
matrices with the row sums not exceeding $R$ and column sums not exceeding $C$.
Therefore,

(8.4.1)    $|g(Y) - g(Z)| = O(mn\epsilon)$ for all $Y \in [X, Z]$.
Suppose that for some $0 < \beta < 1/2$ we have

$|\xi_{jk} - \zeta_{jk}| > \beta\zeta_{jk}$ for some $j, k$.

Then there is a matrix $Y \in [X, Z]$, $Y = (\eta_{jk})$, such that $|\eta_{jk} - \zeta_{jk}| = \beta\zeta_{jk}$. As in
the proof of Lemma 8.3, we argue that

$g''(x) \ \le\ -\frac{1}{8\tau^2}$ for all $x \in [\eta_{jk}, \zeta_{jk}]$

and that

$g\left( \frac{\eta_{jk} + \zeta_{jk}}{2} \right) - \frac{1}{2} g(\eta_{jk}) - \frac{1}{2} g(\zeta_{jk}) \ \ge\ \frac{\beta^2\delta^2}{64}.$

Let $W = (Y + Z)/2$. Then $W \in [Y, Z]$ and by (8.4.1)

$g(W) \ \ge\ g(Z) + \frac{\beta^2\delta^2}{64} - O(mn\epsilon).$

Since $g(W) \le g(Z)$, the proof follows. □

(8.5) Proof of Theorem 1.3. All implicit constants in the "O" and "Ω" notation
below may depend on parameter $\delta$ only.

In view of Section 8.1, without loss of generality we assume that $\tau > (m+n)^{10}$
in (1.1.3). As follows by [D+97], as long as $\tau > (m+n)^2$ we have

$\#(R,C) = \operatorname{vol} P(R,C) \left( 1 + O\left( \frac{1}{m+n} \right) \right),$

where $\operatorname{vol} P(R,C)$ is the volume of the polytope of $m \times n$ non-negative
matrices with row sums $R$ and column sums $C$ normalized in such a way that
the volume of the fundamental domain of the $(m-1)(n-1)$-dimensional lattice
consisting of the $m \times n$ integer matrices with zero row and column sums is equal
to 1.

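Let $\alpha = (m+n)^9 \tau^{-1}$ and let

$\tilde R = (\tilde r_1, \ldots, \tilde r_m)$ and $\tilde C = (\tilde c_1, \ldots, \tilde c_n)$

be positive integer margins (so $\tilde r_1 + \ldots + \tilde r_m = \tilde c_1 + \ldots + \tilde c_n$) such that

$(1 - \epsilon)\alpha r_j \le \tilde r_j \le \alpha r_j$ and $(1 - \epsilon)\alpha c_k \le \tilde c_k \le \alpha c_k$

for some $0 < \epsilon < (m+n)^{-7}$.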

Then

(8.5.1)    $\#(R,C) \approx \operatorname{vol} P(R,C) \approx \alpha^{(m-1)(1-n)} \operatorname{vol} P(\tilde R, \tilde C),$

where "$\approx$" denotes the equality up to a $O\bigl((m+n)^{-1}\bigr)$ relative error.

Let $Z = (\zeta_{jk})$ be the typical matrix of margins $(R,C)$ and let $\tilde Z = (\tilde\zeta_{jk})$ be the
typical matrix of margins $(\tilde R, \tilde C)$. By Lemmas 8.3 and 8.4, we have

(8.5.2)    $\bigl| g(Z) + mn \ln \alpha - g(\tilde Z) \bigr| = O\left( \frac{1}{m+n} \right)$ and

$\bigl| \tilde\zeta_{jk} - \alpha\zeta_{jk} \bigr| = O\left( \frac{\alpha\zeta_{jk}}{(m+n)^2} \right)$ for all $j, k$.

Let $q, \tilde q : \mathbb{R}^{m+n} \to \mathbb{R}$ be the quadratic forms associated by (1.2.1) with margins
$(R,C)$ and $(\tilde R, \tilde C)$ respectively. Then by the second estimate of (8.5.2) it follows
that

$\tilde q(s,t) \sim \alpha^2 q(s,t)$ for all $(s,t) \in \mathbb{R}^{m+n}$,

where "$\sim$" stands for the equality up to a $O\bigl((m+n)^{-2}\bigr)$ relative error. It follows
then by the first estimate of (8.5.2) that the Gaussian term (1.3.1) for margins
$(R,C)$, up to a relative error of $O\bigl((m+n)^{-1}\bigr)$, is obtained by multiplying the
Gaussian term for margins $(\tilde R, \tilde C)$ by $\alpha^{(m-1)(1-n)}$.

Similarly, we show that the Edgeworth correction factor (1.3.2) changes negligibly
as we pass from $(R,C)$ to $(\tilde R, \tilde C)$. By making substitutions

$(s,t) \longmapsto \tau^{-1}(s,t)$ and $(s,t) \longmapsto \alpha^{-1}\tau^{-1}(s,t)$

respectively, we express the quantities $(\mu, \nu)$ for margins $(R,C)$ and $(\tilde\mu, \tilde\nu)$ for
margins $(\tilde R, \tilde C)$ as

$\mu = \mathbf{E}\, f^2, \ \nu = \mathbf{E}\, h$ and $\tilde\mu = \mathbf{E}\, \tilde f^2, \ \tilde\nu = \mathbf{E}\, \tilde h,$

where the expectations $\mu$ and $\nu$ are taken with respect to the Gaussian measure on
$H$ with the density proportional to $e^{-\psi}$ and the expectations $\tilde\mu$ and $\tilde\nu$ are taken with
respect to the Gaussian measure with the density proportional to $e^{-\tilde\psi}$, where $\psi$ and
$\tilde\psi$ are positive definite quadratic forms within a relative error of $O\bigl((m+n)^{-2}\bigr)$ of
each other. Moreover, $f^2$ and $\tilde f^2$ are homogeneous polynomials of degree 6 and $h$
and $\tilde h$ are homogeneous polynomials of degree 4 such that

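$f^2(s,t), \ \tilde f^2(s,t) = O\Bigl( \bigl( \sum_{j,k} |s_j + t_k|^3 \bigr)^2 \Bigr),$
$h(s,t), \ \tilde h(s,t) = O\Bigl( \sum_{j,k} (s_j + t_k)^4 \Bigr)$ and

$\bigl| h(s,t) - \tilde h(s,t) \bigr| = O\Bigl( \frac{1}{(m+n)^2} \Bigr) \sum_{j,k} (s_j + t_k)^4.$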



Since by Lemma 3.6, the minimum eigenvalues of $\psi$ and $\tilde\psi$ are $\Omega(m+n)$, standard
estimates imply that $\exp\{-\mu/2 + \nu\}$ approximates $\exp\{-\tilde\mu/2 + \tilde\nu\}$ within a
$O\bigl((m+n)^{-1}\bigr)$ relative error.

We have

$\tilde\zeta_{jk} = O\bigl((m+n)^9\bigr)$ for all $j, k$,

and hence by the result of Section 8.1 we can apply Theorem 1.3 to estimate
$\#(\tilde R, \tilde C)$. The proof then follows by (8.5.1). □